It's a very small group, yes. Are we on video already? OK, thank you. Actually, it doesn't matter. Go for it. This introduction is the first slide. Do you want to say something about yourself? Well, my name is Bernd Helmner. I'm the technical lead in the database team at credativ. I've been doing Postgres since 2002, starting during my studies. My favorite topics in Postgres are migration, high availability, and performance tuning. The mics are just getting bigger, but this is just the introduction. I might need to introduce myself for the video; I think everyone here in the room knows me already. Very briefly: the dates given are when I started to actively develop software for those projects, so I'm kind of an old-timer when it comes to open source. We didn't even have the name "open source" when I started. Anyway, let's keep the introduction short. Today I'm mostly busy not doing open source software development but running a company; obviously, we both work for that company. And again, I'm not going to read you that slide. It's enough for marketing, right? Now, when it comes to Power systems, which we just briefly touched on already, it's a huge area that we as a community don't have that much contact with, and on the other side, all those users don't have much contact with the community here, as you can see from the empty room. Here are some quotes I took from Computer Weekly about Power systems in general; you can read them yourself. There are really huge advantages you can get by switching to Power technology, and we will see some of them during this presentation. One is pretty obvious when talking about OpenPOWER. Let me use a train as an example here. We all know that trains are, in general, safer than cars, despite recent events slightly north of here.
But looking at this particular train ride, I'm not so sure it's safer than using a car. Similarly, in IT we see security problems coming from unexpected sources, like a couple of weeks ago when Apple found malware in the firmware of their servers, which is something you cannot really check, right? It's embedded in your hardware. With OpenPOWER it's different: you can check the source code. With those systems, literally everything is open source; the firmware is on GitHub, I think. And of course you can use Linux then. Well, you should use Linux; I don't think there's anything else you can do. For our systems and our tests, we stayed with the completely open versions and with little-endian distributions. I guess you have all heard about Debian. The picture is pretty old, but I wanted to give you a short introduction to how Debian works. Stable is, obviously, the released version of the distribution. Then we have the unstable branch, where all the development and testing in the distribution goes on, which migrates automatically to testing, which then becomes the next stable. I'm not going to talk about experimental and all those details because we don't need them here. Two things are particularly important about Debian. First, we have backports. We just had a discussion, maybe 10 minutes ago, about running the most current Postgres on a fairly old CentOS. I'm not even sure how old that one was, but it had version 9.2 as the distribution-provided Postgres, and it's not that easy to upgrade then. Debian, by contrast, offers backports, an area of the distribution where maintainers upload newer versions of packages for the old stable release. Yes, Debian doesn't release that often, but you get the newest versions of the more important pieces of software to run on it. Second, since the last release, well, the one that's end of life now, Debian offers long-term support.
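The backports mechanism just described can be sketched roughly like this; the codename and package name are examples I'm assuming for illustration, not details from the talk:

```shell
# Hypothetical example: enable backports on Debian jessie and pull a
# newer Postgres from it. Codename and package name are assumptions.
CODENAME=jessie
LINE="deb http://deb.debian.org/debian ${CODENAME}-backports main"
echo "$LINE"
# On a real system (not run here):
#   echo "$LINE" | sudo tee /etc/apt/sources.list.d/backports.list
#   sudo apt-get update
#   sudo apt-get -t ${CODENAME}-backports install postgresql-9.6
```

The `-t` flag is what opts a single install into the backports area while the rest of the system stays on stable.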
Ubuntu, which is based on Debian and which we also used for our tests, does as well. The other thing you should know is different when using the Debian or Ubuntu packages, which are jointly developed, is that we do multi-version, multi-cluster support. Meaning, it's not that you install a package, you have one version of Postgres on your server, and that's it. Directly out of the package, without any manual adjustments, you can run a 9.6 and a 9.5 and a 9.2 at the same time, just by installing those packages. All the infrastructure needed is readily available, which has the really big advantage that you can do a migration without doing it in place, essentially. You can run the old version, then migrate to the new version and test the new version, all on the same server if you like. Now, a distribution usually offers you only one version of a database package, or of any package. For Debian and Ubuntu there's a community-driven project called apt.postgresql.org. APT is the Debian and Ubuntu package manager, and as you can see from the domain, this is run inside the Postgres project. The idea is to give you packages for all Postgres versions, for all Debian and Ubuntu versions. So whatever you want, just use it: it could be a development version, it could be the latest stable, it could be an older one. For all of them, packages are available at essentially the click of a mouse, or the command-line alternative. So far, they are available only for amd64, i386, and little-endian Power. And, I have to read that up here: for Debian it's stable, oldstable, long-term stable, and unstable, and for Ubuntu it's LTS one and two, stable, and the development release. And for all of these, we have the five supported Postgres versions available, plus the one currently in development. Just do the math.
You get three to four Debian and three to four Ubuntu releases, six Postgres releases, all the server packages, all extensions, all modules, across three architectures. Out of roughly 150 different source packages, we build about 16,000 binary packages, and that almost constantly: every change that is made is reflected on those systems. That, as background, is what we used for all the tests. Originally I was planning to talk about those tests myself, but then I could get him to join me for the trip, and since he did the evaluation, I think it's better for him to tell you more about them. OK. As Michael already outlined, the Postgres community is not only the Postgres database, it's the complete ecosystem, and with the apt.postgresql.org project you get basically all the helper tools, like backup and high availability, from one repository. With the introduction of ppc64 little-endian, I have the possibility to get Postgres, with all the tools around it, into the Power world. I'll skip that because it's not interesting for this audience now. Why did IBM decide to do the little-endian port? Well, the big problem is that big-endian Linux on Power has hardly any adoption in the community; you still need a commercial Linux to run on it. You can do big-endian too, but the attraction for open source developers wasn't that broad. With the little-endian introduction, they suddenly have many more distributions available on the platform: Debian, Ubuntu, CentOS 7, Fedora, and openSUSE all provide little-endian ppc64 packages. And for them it's easier to do the QA for their ports. Another attraction is projects which aren't available on big-endian at all. A prominent example is Docker: Docker doesn't run big-endian on ppc64. With little-endian, you get it for free now. I think Michael already mentioned it.
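Coming back to the packaging side for a moment, the multi-version, multi-cluster setup described earlier can be sketched like this. The repository line and version numbers are assumptions for illustration; the `pg_*` tools come from the postgresql-common package that ships with the Debian/Ubuntu Postgres packaging:

```shell
# Assumed example of running several Postgres majors side by side with
# the Debian/Ubuntu packaging tools (postgresql-common).
REPO="deb http://apt.postgresql.org/pub/repos/apt/ jessie-pgdg main"
echo "$REPO"
# On a real system (not run here):
#   sudo apt-get install postgresql-9.5 postgresql-9.6  # coexist fine
#   pg_lsclusters                       # list every cluster, any version
#   pg_createcluster 9.6 migr --start   # extra 9.6 cluster on its own port
#   pg_upgradecluster 9.5 main          # migrate a 9.5 cluster upward
```

Each cluster gets its own port and data directory, which is what makes the "test the new version next to the old one on the same server" workflow possible.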
IBM also decided at some point to open source their firmware and to give other manufacturers the possibility to license the Power hardware. We got access to a fairly new OpenPOWER machine provided by the German Thomas-Krenn AG. It's a single-socket 8-core machine with completely open firmware, the so-called OPAL environment. You can download the firmware from GitHub and compile it on, for example, your x86 machine; it takes around three hours. Then you can make your modifications, upload the firmware to the server, and run it. So if you want to do some debugging or code evaluation, it's possible to just cross-compile it and debug it there. The machine provided by Thomas-Krenn has 512 gigabytes of RAM and a MegaRAID SAS controller from Avago, and it was running Debian Jessie with packages from apt.postgresql.org. It's bare metal; nothing is virtualized. The OpenPOWER firmware can be downloaded, as I already mentioned, from github.com/open-power. It's adopted by other companies, too; for example Talos, who provide workstation mainboards with Power CPUs. You just need to license the Power hardware, and then you can do everything you want, customizing and so on. The OPAL firmware can be virtualized with KVM: you can either use PowerKVM, a specialized IBM KVM distribution, or roll your own. We also did some performance comparisons on our four-socket E850 machine: eight cores, four CPUs, sorry, at 3.7 gigahertz. That's a little bit faster than the OpenPOWER machine's 3.32 gigahertz, but it has a little less RAM, 128 gigabytes. It was running Ubuntu LTS, but the main difference is that it was running virtualized on PowerVM. PowerVM is the commercial virtualization on the Power architecture, and a guest isn't really called a guest there but an LPAR, a logical partition. That's the nomenclature used by IBM.
It's a little bit different from what we know from KVM, because KVM is a hypervisor which does everything. PowerVM does just the virtualization; storage handling, or virtualized storage handling, is done by the VIOS. I think so, yes; it's basically borrowed from them, yes. PowerVM is basically a very small hypervisor that is always running on such an enterprise IBM server. You can mix, for example, big-endian and little-endian: if you are forced to run AIX, for example, together with Linux big-endian and little-endian, you can do so. One advantage of the Power technology is the so-called SMT, simultaneous multi-threading. I mention it here especially because I have some numbers about what it brings you. It's basically the same thing Intel does with hyper-threading, but IBM has put big effort into getting more parallelization out of one core. For example, one POWER8 core has dedicated resources for up to four SMT threads; they can access registers, memory, or CPU caches without interlocking. The Power CPU can do SMT1, 2, and 4, and with POWER8, SMT8. OK, let's dive into some numbers. I used the PostgreSQL internal benchmark, pgbench. It's also used by the hackers to prove performance improvements and performance patches in the project. I slightly tuned the PostgreSQL configuration, mainly the most important settings, like shared buffers, the WAL sizes, and so on. TPC-B is basically the default workload used by pgbench; it's just a transactional throughput test. It has nothing to do with what an application does in reality, but to get some idea of what we can expect from the Power platform, I think it's the right tool. Each test database was 15 gigabytes, which is a scale factor of 1,000 in pgbench. I mainly used prepared statements; I will tell you which numbers were done with a different method.
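As a sketch of that setup: the pgbench flags below match what was described (TPC-B default workload, prepared statements, scale factor 1,000), while the configuration values are my assumptions where the talk doesn't state exact numbers:

```shell
# postgresql.conf fragment (values assumed, per the tuning described):
#   shared_buffers = 15GB
#   max_wal_size   = 20GB
#   huge_pages     = try
# Benchmark commands (not run here):
#   pgbench -i -s 1000 bench                       # init, ~15 GB of data
#   pgbench -M prepared -c 100 -j 100 -T 300 bench # TPC-B-like run
# Sanity check: one pgbench scale unit is roughly 15 MB of data, so:
SCALE=1000
echo "approx database size: $(( SCALE * 15 / 1024 )) GB"
```

The rule of thumb in the last line is only approximate; the exact on-disk size depends on fill factor and indexes.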
And every pgbench run was preceded by pg_prewarm, to get the caches of the operating system and the database warmed up. If you have any questions, just raise your hands. Why scale factor 1,000? Because most of the numbers you see, for example, on the hackers list are usually done with a scale of 100 or 1,000, where everything fits completely into the shared buffer pool. That's one reason. I also tried bigger ones; the problem was that I wanted to use huge pages on the Power machine, but PowerVM didn't let me reserve more than 16 gigabytes of huge pages on the bigger machine. I talked with the IBM guys in Ehningen in Germany, where the machine was located, but they didn't know the reason for this limitation either. There really is a limitation in PowerVM, though. So to keep all the numbers comparable, I decided to go with that scale factor. And you want to stay memory-resident, to scale the machine itself and not rely on storage or something like that. Yeah, exactly. The numbers I present here are basically just to give an idea of the scalability of the CPU itself; it's CPU and in-memory performance. That's where the Power and Intel architectures differ, right? The attached storage is probably the same either way, so you don't have to benchmark the storage. You have much bigger caches on a POWER8 CPU than, for example, on common Intel CPUs. But the main difference is the memory speed. OpenPOWER can do up to 230 gigabytes per second, if I recall correctly, if you choose the correct memory; the bigger enterprise machines can do even more, I think 320 gigabytes per second. That depends on the hardware you purchase. Can you go back a little bit? The reason we have 512 gigabytes in that machine is that the Centaur memory system used on that Tyan board, a board actually purchased from Tyan, can reach full memory speed only if all banks are populated.
If you don't populate all memory banks, you get lower memory speed. That's why we used 512 gigabytes here. OK, let's start with some numbers. This is a comparison graph: the E850 machine, a logical partition with Ubuntu LTS, compared to OpenPOWER, which again is bare metal. We can see that the OpenPOWER machine is even slightly faster than the E850. This is one socket only; the LPAR was restricted to run on one socket with eight cores, so it's comparable to the OpenPOWER machine. And we see the OpenPOWER machine is slightly faster. That's not surprising, because it's nearly the same CPU. The OpenPOWER machine is actually clocked slightly lower, but it's bare metal and nothing is virtualized on that machine. The E850 LPAR is also using VIOS, so storage virtualization, which might have some influence here; I'm not sure about that. We would have to try it without storage virtualization, but that was not possible on that machine. So we can do up to 450,000 transactions per second with 100 connections, and 360,000 with the E850 machine with eight cores. To give a rough estimate of where we are compared to an Intel CPU, I had some graphs from a customer project from November with an Intel machine using two E5-2650 10-core CPUs. I decided to put those numbers in there. The problem with that benchmark is that I only have three different client-connection counts tested. But I think that's still enough to see that, I have to look around the corner here, the E850 platform performs a little better than the Intel platform, starting with 32 connections. And interestingly, with 32 clients, tested against 28 cores on the E850 machine, it's even faster. I have to say that pgbench is running on the same machine, over Unix domain sockets. The reason is that, to be comparable with the OpenPOWER machine, I had to do it that way, because I don't have a second machine to test against the OpenPOWER box.
But the Intel machine is always a little slower. Just to get an idea: you have 20 physical cores against eight cores, and against 28 cores. Any questions? Is pgbench also running on the same cores? Yes, always. That might explain the drop: we have 64 logical CPUs, and throughput decreases a little before we reach that, because you then have 64 pgbench threads and 64 Postgres backends. What do you mean by that? Well, if you have eight cores dedicated to Postgres, pgbench would compete for the same cores. How many cores does pgbench need? It was always one to one: if I had 100 database connections, I used 100 pgbench threads. The interesting part, just give me a few minutes, is on another slide, which might be interesting for you then. OK, some detail. I have to get back to the red line. You can basically see that Postgres scales very well up to 28 cores. The Intel machine isn't scaling linearly either, but it scales much more slowly than the Power architecture here, just to give an idea. Here are some details for the eight-core and 28-core runs. Why 28 cores, or CPUs, on that LPAR? The problem is, and this was told to me by the IBM guys from Ehningen, that the hypervisor and the VIOS on the IBM machines use CPU cores themselves. So it's a bad idea to use an LPAR which spreads across all cores; you have to reserve some for the hypervisor and the VIOS server. That's why we reserved four cores for them and 28 cores for the Ubuntu LPAR, and that's why we have 224 logical CPUs here. So we can do up to 600,000 transactions per second with 28 cores, which seems fairly good, I think. I also did some comparison between POWER8 SMT modes, because I was interested in how big the influence of SMT is on Postgres performance. I decided to start with just SMT-1; SMT-1 basically means SMT off.
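For reference, switching SMT modes on Linux on Power is done with the ppc64_cpu tool from the powerpc-utils package; a minimal sketch, with the hardware-touching commands commented out since they need root on a Power machine:

```shell
# SMT mode control on a Power machine (not run here, needs Power HW):
#   ppc64_cpu --smt=1   # SMT off: 8 cores -> 8 logical CPUs
#   ppc64_cpu --smt=4   # 8 cores -> 32 logical CPUs
#   ppc64_cpu --smt     # show the current mode
# The logical CPU count is simply cores times SMT threads per core:
CORES=8
SMT=4
echo "logical CPUs: $(( CORES * SMT ))"
```

With SMT8 on the 28-core LPAR, the same arithmetic gives the 224 logical CPUs mentioned above.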
So you have only eight CPUs available, and the virtualized guest peaks at about 150,000. With SMT-2 it doesn't double, but it gets to about 230,000, and with SMT-4 you get above 300,000. SMT-8 doesn't give you as much more as you might expect. The reason is that the POWER8 CPU has dedicated resources for up to four execution threads; with eight execution threads, those resources are just divided. That's why SMT-8 gives only a slight improvement. I think POWER9 will basically go back to SMT-4. I'm not sure, because POWER9 will come with different architecture designs, I think; there might be some POWER9 CPUs with SMT-8, but I think they'll go back to the POWER7 approach, which provided just SMT-4. That's the information I have, anyway. I also did some comparison with the older Postgres releases available on apt.postgresql.org: 9.3, 9.4, 9.5, and 9.6. The numbers for 9.6 and 9.5 are basically identical; you have some noise from pgbench and so on, but the curves are nearly identical, so I don't think there's much difference. The biggest difference, of course, is against 9.3. That is probably due to the scalability patches that went in since then. It's remarkable that you get nearly 100,000 transactions per second more with 9.6 than with 9.3, or 80,000. 9.3 has another complication here: it couldn't use huge pages, since that feature wasn't there yet; the 9.4, 9.5, and 9.6 instances all used huge pages. I did some more comparisons, and I do have a slide for this, because the next one might answer the earlier question. I ran this test because, when running the scalability benchmarks, I had the impression that the machine wasn't really saturated. Even if you run pgbench on the same host, it didn't really saturate the whole machine, especially in the configuration with 28 cores.
And well, it did do something, but the CPU saturation wasn't over 60 percent, and Postgres couldn't do more. That's what this figure shows. With one instance, with four sockets, so 28 cores, we reach nearly 450,000 transactions per second. If I start a second instance, a completely separate Postgres 9.6 installation on another port, and run pgbench at the same time against the same machine, it achieves about the same numbers, both concurrently. If I start a third instance, I get somewhat lower numbers, nearly 350,000, and the system slowly starts to saturate. With four instances, you see some fluctuation in the numbers, which is an indication that the system is nearly saturated; we really had 100 percent CPU time used. And with four instances, the machine did two million context switches per second. So, Alexander, I couldn't do that, because I didn't have the machine available anymore. Yeah, that's a pity; I really would have liked to do that. That would have been a really interesting picture, I think. The problem with that machine is that it went into production; it was being tested at the customer site and they wanted to complete their project. But yes, that would be interesting to see. This was done with a 15 gigabyte shared buffer pool each, without huge pages. The number of threads was always the same as the number of connections, so -c 100 -j 100, scaled linearly. My colleague Julian Scholder did some interesting tests with different ratios, for example using 128 database connections and just 16 pgbench threads, which gives slightly different numbers. The outcome of his findings was that you should always use a power of two in pgbench, because otherwise you get really strange results. pgbench is a beast. I also did some write scaling. The write scaling was only done on the one-socket OpenPOWER machine with eight cores, with pgbench.
You can do up to 32,000 write transactions per second on that instance, but you have to tune it a little. The blue curve is with max_wal_size at 20 gigabytes, and the red curve is with max_wal_size, so the size of the transaction log, at 120 gigabytes. Checkpoints play a role here. My colleague Julian also repeated the tests with aggressive WAL writer settings, but that didn't gain anything more. The system was still not saturated here; plenty of CPU time was still available. My other colleague, Christoph Berg, who is one of the maintainers of apt.postgresql.org, then asked me: what about Power-specific optimization of the compiled Postgres binaries? You can do some POWER8 optimization with GCC. IBM itself provides the Advance Toolchain, a customized GCC build environment with a newer version of GCC. On Ubuntu, apt.postgresql.org normally builds with the standard distribution compiler, GCC 5.4; with the IBM Advance Toolchain you get GCC 6.3. What they also ship are specialized, optimized crypto libraries; for example, there's an optimized OpenSSL library in there, and so on. The Advance Toolchain compiler enables the O3 optimization level by default, instead of the O2 that PostgreSQL usually uses, plus the -mcpu and -mtune options for POWER8, which activate some POWER8 tweaks, AltiVec options and so on. So he asked me to compare that; maybe it would be worthwhile for apt.postgresql.org to add some optimization to their build environment. The outcome, for this benchmark run, pgbench without SSL or anything else, just plain transaction throughput, is that it doesn't matter: whether you use the plain GCC or the IBM Advance Toolchain, you don't see any difference. That might depend on the workload, of course. If you're using SSL connections, for example, you might benefit from it. For plain transaction throughput tests, you are just saturating the database.
The database drops to nearly 150,000 transactions per second if you use SSL, for example; SSL has a huge impact on the CPU. So it might be worth repeating that with an SSL benchmark. So, what about AIX? We have, well, I know plenty of people who are still using AIX on the Power platform. I did some comparison with AIX 7.1, in basically the same environment, with the same storage, on the E850 machine as the Linux tests. The problem with AIX is that even the experts at IBM had a hard time figuring out how to get high transaction rates out of Postgres on that platform. It's really difficult to tune. Here's a small excerpt of what we tried. First of all, use GCC if you want to run Postgres on AIX; XLC doesn't like Postgres at the moment. I think there are patches around: Konstantin Knizhnik reported on hackers that there are some problems with the XLC compiler and provided some patches, but I haven't tried them, unfortunately. You have to make sure to build with OBJECT_MODE=64, so you get 64-bit XCOFF binaries. The XCOFF format allows you to tune some settings in your binary after compilation; that's done with the ldedit command. I was advised to use the fork policy COR; I'm really not an AIX expert, but I believe that guy. Especially important for pgbench: you have to export MALLOCOPTIONS on AIX, because otherwise, when you use pgbench threads, AIX uses one single heap for all the threads, so every thread which modifies the heap has to wait for the others. With multiple heaps, the default setting of 32, for example, keeps you safe; but if you make it too large, you have the problem that maintaining all the thread heaps is costly, so you have to find the right balance. That's what we did, with recommendations from our AIX kernel guy in the USA. And there's the schedo command, to force some special scheduling policies in the AIX kernel. The bottom line is that it's all much easier with little-endian Linux Postgres.
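The AIX tweaks just listed might look roughly like this as a build and run environment. The values are as recalled in the talk, so treat this as a starting-point sketch, not a verified recipe; the ldedit and schedo flags are left as placeholders on purpose:

```shell
# Sketch of an AIX build/run environment for Postgres, per the talk.
export CC=gcc              # XLC currently needs patches for Postgres
export OBJECT_MODE=64      # produce 64-bit XCOFF binaries
# pgbench with many threads: give each thread its own malloc heap
# (32 is the default maximum; too many heaps make maintenance costly):
export MALLOCOPTIONS=multiheap:32
echo "CC=$CC OBJECT_MODE=$OBJECT_MODE MALLOCOPTIONS=$MALLOCOPTIONS"
# After compilation, XCOFF flags can be adjusted (not run here):
#   ldedit -b <flag> postgres     # e.g. the fork policy mentioned above
#   schedo -o <tunable>=<value>   # scheduler tuning, root only
```

The MALLOCOPTIONS line is the one that matters most for the pgbench client itself, since the benchmark threads otherwise serialize on a single heap.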
But if you still want to use AIX, it's better to use packages. Michael Perzl from Germany has a special AIX repository available where all the Postgres stuff is packaged as RPMs, and you can easily install it with YUM, for example, on AIX. So, long story short, the outcome: this is a comparison between the E850 8-core LPAR instance on AIX versus Linux. Again, the same benchmark, scale factor 1,000; the Postgres configuration was completely the same, except that AIX didn't use huge pages. Yeah, I think that graph shows everything. And that's even a tweaked binary. It tops out at nearly 200,000 transactions per second, compared to 350,000 transactions per second. But again, the system is not saturated, even under AIX, so I have no idea what the cause actually is. A kernel guy had a look at some traces on AIX, and he suspected it has something to do with inter-process communication. I think there was also a thread on hackers saying that the mmap implementation on AIX is suboptimal, because every time you touch an mmap'ed region, which is basically our shared buffer pool, a lock has to be taken. I did some tweaks to relax that a little; that's what the legend refers to, the relalias option, which was set to 200 in the AIX kernel. But it didn't bring much benefit. It's a slight plus compared to the other results, but the differences all lie within 10 percent, and anything within 10 percent is, for me, more or less pgbench noise, so we have to take that carefully. So, the bottom line from my findings is, yes? Well, I would have to talk with that guy about that. This is just interesting information here, right? Does IBM know about those benchmarks? They do know that we are presenting them here, of course. One of those guys told me that the priority of AIX is low in the project itself. Yeah, that's actually my conclusion too, my opinion.
With ppc64 little-endian, you get basically everything from the open source world, and you are open with that architecture. Power is, I think, more prominent in Germany than in other countries. I know many companies, for example, that are dedicated to SAP HANA; they run on big-endian, but they also want to use other open source projects for their IT infrastructure. And an IBM guy told me that China was one of the first countries where companies purchased a Power license. And maybe one more thing before we finish; I guess we are running out of time, right? We have five minutes. The way I understand it is, if I put roughly the same money on the table for an OpenPOWER system and for an Intel system, I get more performance out of the OpenPOWER system, right? If you look at, where's the number, I don't have the OpenPOWER machine in this comparison because I didn't have the possibility to repeat that test with it. The OpenPOWER machine was traveling through Germany to different customer sites, so I only had small time slots to do my benchmarks there; there are not that many machines available right now. Where is it? That was too fast. So with just eight physical cores, it's much faster: you get nearly 400,000 transactions per second. But this is compared to what? I'm just interested in the blue line. With Intel, you get nearly 150,000 at 32 connections and 200,000 at 64. So Intel is lower, yes, in that comparison; with Power you get more. Yes? Does it just need more work to run such a machine? It depends. If you're using PowerVM, for example, on a big enterprise machine, it gets more difficult. But I think the smaller OpenPOWER machines you get from IBM are fairly easy to use. I talked with some guys from Thomas-Krenn, who are building that machine, and they say it's not that big of a problem.
So essentially, those OpenPOWER machines run on Linux, right? Yeah. I'm asking because people who tried it a few years ago didn't think much of it. On what machine? Well, it was some Power machine. We have customers still using the old POWER7 machines, and I didn't hear anything about it being really hard to get running. Maybe his contact was one of the first to try Linux on OpenPOWER. That might be possible, yes. I compiled the firmware on my Fedora workstation at work; it took three hours, and it worked. The POWER9 CPU, for example, gets hardware compression, which Postgres might benefit from, with LZ4. Any more questions? No? Yeah, thank you very much.