[inaudible] So essentially we are the guys who make those nice colored maps. Let me go back. This kind of map. Here it doesn't look too good because of the projector. This is the map of Europe; this is a project called SHARE, a project that was done at the ETH institute in Zurich. [inaudible] Maybe you have read some of my papers? If you have studied [inaudible] in Python, I wrote an article about the C3 method resolution order, and there is another series of articles about metaclasses in [inaudible]. This was 12 years ago, when I started with Python.
Because at the time I was still a physicist and Python was my hobby, I had the time to write and write. Then I began to work, so my contributions to Python became a little bit less. I'm also the author of the decorator module, which you are probably using even if you don't know it, because it is a dependency of SciPy. I think it is also a dependency of IPython. It is a dependency of several web frameworks; I remember Pylons. Even frameworks that I discovered [inaudible]. Now the assets are buildings and we have to compute the damages when something disastrous happens, and it's actually very similar. So there we had a calculation engine, and now we have an engine for earthquakes, so it's not so different in the end. And I arrived at GEM, as is written there, in October, and after a while I became the [inaudible] Gaël Varoquaux [inaudible] the two worlds: the scientists on one side, and the web programmers, or programmers in general, on the other side. And I am in transition between these two worlds, because I came from physics and then spent 10 years doing web and database development, this kind of stuff.
[inaudible] We were discussing with the guys there, and they told me: you know, we use Postgres to store our floating point numbers, essentially big arrays. I was very surprised, and I was thinking: I have never seen anybody doing serious numerical calculations by storing everything in Postgres. But I thought, maybe [inaudible]
[inaudible] The worst of it was that there was no way to decouple the database logic from the scientific logic, because everything was in terms of Django models. So what do you do in this situation? [inaudible]
[inaudible] Because then you actually have to read the data back from the database, and the real problem was reading from the database, because we ran out of memory. The third time we tried this computation on the cluster, we ran out of memory. The cluster was 128 gigabytes times four machines, so half a terabyte of memory. It wasn't enough, and I later estimated that it would have taken more than two terabytes of memory to run it, so there was totally no way, okay? So there were only two options: either you don't save the ground motion fields, or you save them in a more efficient way. You can see here what is missing. You may think there is an error here: there is only one column, and this second column here is missing. Actually, it is not an error. This is the release of the engine; we have the release number of the software, the engine. This was release 1.5, and this was using the database, so it took, I don't know, 40,000 cumulative seconds. Cumulative means summing the time over all the workers. It took 40,000 seconds; now, with release 2.0, it takes less than half a second. So this column actually is there, but you cannot see it because it is half a second.
So there are five orders of magnitude of difference, depending on whether you use the database, Postgres, or HDF5. Five orders of magnitude. And I must say that this is engine 1.5; when I arrived it was before engine 1.0, and the situation was much worse. Some of the things I did were, for instance, improving the Postgres queries, so I did bulk inserts instead of many single inserts. I tried things like removing the indices. I tried all the stuff that you can do with a database. So this is already very much optimized. But if you remove the database, you can gain five orders of magnitude. And five orders of magnitude means that one day of computation becomes one second. One year of computation becomes five minutes. So it means you can do computations that were otherwise impossible. I measured this kind of speedup both here, in the ground motion fields, and in another part of the code. When I measured this speedup, I thought: no, I made a mistake, it's impossible. But I measured it again, and there was no mistake. In this second case there were a thousand queries, because Django was using the object-relational mapper, so instead of doing one big query it did a lot of small ones. These measurements are real, and actually there could be even bigger cases, but I couldn't measure them, because after waiting two or three days you get tired and you stop. The larger the computation, and actually the larger the cluster, the worse the situation was. Also, if your database was mostly empty, the performance was OK. If the database was nearly full (we had one terabyte of disk space), then the performance was much worse, of course, because you have to insert into a database which is already full. And we also had memory problems, et cetera, et cetera.
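The arithmetic behind these figures can be checked with a minimal sketch. It assumes a flat speedup of exactly 10^5 and a 365-day year, which are simplifications for illustration; the only measured inputs are the 40,000 cumulative seconds and the half second quoted in the talk, and the function name is hypothetical.

```python
SPEEDUP = 10 ** 5  # five orders of magnitude, the figure quoted in the talk

SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY  # assumption: a 365-day year


def after_speedup(seconds, speedup=SPEEDUP):
    """Return the runtime in seconds after dividing by the speedup factor."""
    return seconds / speedup


day = after_speedup(SECONDS_PER_DAY)    # a bit under one second
year = after_speedup(SECONDS_PER_YEAR)  # a bit over five minutes
measured = 40_000 / 0.5                 # ratio of the two quoted timings

print(f"one day  -> {day:.2f} s")
print(f"one year -> {year / 60:.1f} min")
print(f"measured ratio: {measured:,.0f}x")
```

The measured ratio comes out slightly below 10^5, so "five orders of magnitude" is a rounded figure, and "less than half a second" means the real ratio is at least this large.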
So I didn't expect this kind of incredible result at the beginning. I knew about HDF5, because the scientists told me about it, actually, and I did some experiments and saw that HDF5 was at least 100 times faster, but I would not have expected 100,000 times. The reason is that a small experiment with a small database is one thing, but in the large case the database behaves much worse, so the speedup is much bigger. I wanted to remove this stuff, but sometimes you cannot. I went to my boss and said: look, there are these problems. He told me: look, I believe you, I'm totally convinced that you are right, but the architecture of this application has just been rewritten, they spent one year on it. I cannot tell the scientists, oh, now we have to rewrite it again, because we have to release in six months. And there is another interesting thing: a younger colleague of mine, who came a few months before me, did an analysis of the code and told the same thing to the boss in an email. But this email, by mistake, ended up in the hands of the developers who had built this architecture, and of course they were not happy at all about that mail, so there was friction. Also, the teams were split: we had half of the team in Zurich and half of the team in Pavia. I was in Pavia. Essentially the team in Zurich did the hazard part. The hazard part means the probability of having an earthquake; the risk part means the economic damage. We were in charge of the risk part, but it is very difficult to do an efficient calculation of the risk if the hazard comes to you in a format which is not the right one. If you have to query tables, they are not structured well for the risk; maybe they are structured well for the hazard, but not for the risk. And the teams don't talk to each other, or there are issues, et cetera, that I don't want to go into, but you can imagine.
So what did I do? I did nothing for eight months. Nothing except study the code base, learn what was there, maintain it, fix the bugs, and let the frustration grow, which I think is a positive thing, because the more frustrated you are, the more you want to change things. If it is limited in time, at a certain moment you are motivated and things can get done, so a bit of frustration is good in the end. We added tests, and we improved the sections which were not the real problem. For instance, the first thing I did there was to put in place a monitoring system, so that while the system was running I could measure how much time it was taking to run queries and to do other stuff. Even that was not so easy. Maybe after the talk, if you want, ask me and I can tell you some stories. I rewrote the XML parsing, because the seismic models are in XML format and it was done badly. I unified the concurrency, using my map-reduce everywhere, because it was done in strange ways. Anyway, a lot was done in these eight months, but not on the real problem. Fortunately, the Zurich team evaporated, and I don't mean this in a bad way: nobody was fired. But people left. It was made clear that the company wanted to move most of the development to Pavia and not Zurich. So somebody left and found a better job. Somebody else... We also had a contract with ETH which had a time limit, and at the end of the limit we said: OK, if you want, you can come and work with us, but in Pavia. And the guy said: I have a girlfriend here. So, you know, the team evaporated, and we took charge of both hazard and risk. And we started to remove stuff. More than 10,000 lines of code. There was a lot of stuff which was really not used at all. We removed it, and we decided what to do. By we I mean essentially me and my colleague Luigi, the one who wrote the email that I mentioned before. And we decided: OK, let's keep Postgres, because we cannot throw everything away.
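The monitoring system mentioned above is not shown in the talk. As a minimal sketch of the idea, assuming nothing about the real implementation, a context manager can accumulate wall-clock time per named operation; the class name `Monitor`, its methods, and the operation labels are all hypothetical.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class Monitor:
    """Accumulate wall-clock time and call counts per named operation."""

    def __init__(self):
        self.seconds = defaultdict(float)
        self.counts = defaultdict(int)

    @contextmanager
    def measure(self, operation):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the elapsed time even if the measured block raises
            self.seconds[operation] += time.perf_counter() - start
            self.counts[operation] += 1

    def report(self):
        """Return (operation, seconds) pairs, slowest first."""
        return sorted(self.seconds.items(), key=lambda kv: kv[1], reverse=True)


monitor = Monitor()
for _ in range(3):
    with monitor.measure("saving results"):
        time.sleep(0.01)  # stand-in for a slow database write
with monitor.measure("computing"):
    sum(i * i for i in range(1000))

for operation, seconds in monitor.report():
    print(f"{operation}: {seconds:.4f} s in {monitor.counts[operation]} calls")
```

Sorting the report by total time puts the dominant cost at the top, which is the kind of evidence the speaker describes bringing to discussions about where the real problem was.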
And we changed a lot in the database. We implemented the migration mechanism, which was missing, so we could alter tables, change the structure, change the queries. A lot, a lot of work. Thirteen months, more than one year, spent fighting with Postgres. And we got something like one order of magnitude of improvement. So we had a lot of improvements, but not five orders of magnitude. But anyway, in the meantime we had released: release 1.0 went out on schedule. We had a release, we had users. And by the way, I'm not saying that everything was wrong or broken, because some calculators, like the one behind the map of Europe that I showed, didn't have big performance issues. So that one worked. It was the other ones that had big problems. Anyway, we could keep this software, this code, working. Finally, in September, so nearly two years ago, I did more measurements and realized that I could have improved a bit more on the Postgres side, but it was not worth the effort. I can say, OK, I can improve by maybe a factor of 2, 3, 5, but I need a factor of 10,000 to make this computation possible. At the time I knew that HDF5 was much better, and we don't use transactions. The architecture was wrong because we didn't need the database at all. We didn't need to do any queries. We had just some spatial queries, which were extremely slow, and we didn't need to do those queries either. So the database was totally useless at that point. So I decided: let's try to remove it, with the excuse of the Windows porting. Because we are doing open source software (the software is called OpenQuake Engine; open means that everything is on GitHub, with code reviews, everything is public), we want this software to run on any platform, so that any scientist in the world, any seismologist, can run it on his laptop, not only on a cluster.
So if you want to run on Windows too, on a laptop, maybe you don't want to ask a scientist to install Postgres and fight with the configuration files, things like that. So I said: let's try to write a light version. We keep the monster as it is, but we write a small version that doesn't depend on the database. I say excuse, because for me it was clear from the beginning that at some moment the toy model, the light version, would completely replace the monster. It was decided from the beginning. So anyway, I did that. It was a lot of effort, because I essentially needed to duplicate things: old calculators, new calculators, making sure that they give the same numbers and run the same tests. Fortunately we have a lot of functional tests. But I learned stuff, and it was fun. I made a lot of changes; here are some of them. I plan to leave time for questions, so I will just let you read these slides, and if you have a question on these points, please ask me in five minutes when I finish. So there were a lot of changes, but the end of the story is that there is no Postgres anymore. There is a SQLite database. The SQLite database is used to store metadata. For instance, I have a calculation: there is a job ID, start time, stop time, description. Then there is a table with the logs of that calculation, a table with the outputs of the calculation, and a table with the performance of the calculation. That's it. All the scientific data, all the arrays, are inside the HDF5 file. If you don't know what HDF5 is, please study it, because it is really impressive software, extremely easy to use and extremely performant. These are some of the things I did. Again, ask me afterwards if you are interested in this kind of stuff. And a lot, a lot of stuff: I did the port to Python 3, the port to Windows, the serialization to HDF5.
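The metadata layout just described (a job table plus logs, outputs, and performance) can be sketched with the standard library `sqlite3` module. This is only an illustration under stated assumptions: the table and column names below are hypothetical, the engine's actual schema is not shown in the talk, and the scientific arrays would live in a separate HDF5 file that this sketch does not touch.

```python
import sqlite3

# Hypothetical schema: one row per calculation, plus three child tables
SCHEMA = """
CREATE TABLE job (
    id INTEGER PRIMARY KEY,
    description TEXT NOT NULL,
    start_time TEXT,
    stop_time TEXT
);
CREATE TABLE log (
    id INTEGER PRIMARY KEY,
    job_id INTEGER NOT NULL REFERENCES job(id),
    message TEXT NOT NULL
);
CREATE TABLE output (
    id INTEGER PRIMARY KEY,
    job_id INTEGER NOT NULL REFERENCES job(id),
    name TEXT NOT NULL
);
CREATE TABLE performance (
    id INTEGER PRIMARY KEY,
    job_id INTEGER NOT NULL REFERENCES job(id),
    operation TEXT NOT NULL,
    seconds REAL NOT NULL
);
"""


def create_db(path=":memory:"):
    """Open a SQLite database and create the metadata tables."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def new_job(conn, description):
    """Register a calculation and return its job ID."""
    cur = conn.execute(
        "INSERT INTO job (description, start_time) "
        "VALUES (?, datetime('now'))",
        (description,),
    )
    conn.commit()
    return cur.lastrowid


conn = create_db()
job_id = new_job(conn, "demo calculation")
conn.execute("INSERT INTO log (job_id, message) VALUES (?, ?)",
             (job_id, "started"))
conn.execute(
    "INSERT INTO performance (job_id, operation, seconds) VALUES (?, ?, ?)",
    (job_id, "reading input", 0.4),
)
conn.execute("UPDATE job SET stop_time = datetime('now') WHERE id = ?",
             (job_id,))
conn.commit()

row = conn.execute(
    "SELECT description, stop_time IS NOT NULL FROM job WHERE id = ?",
    (job_id,),
).fetchone()
print(row)
```

Because SQLite is a single file with no server to configure, a layout like this avoids the installation step the speaker wanted to spare scientists on Windows laptops.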
A lot of work on XML, because in the beginning we had the XML schema, which was another completely wrong idea. A lot, a lot of stuff. And I am surprised, because with such an enormous change it went surprisingly well. I don't know why. Anyway, it went well in the sense that now we actually have a performance which is really thousands of times better than before. We have more tests than before. The scientists are happy. They can run it on their laptops; it works on Windows, it works on macOS, it works on Linux, it works everywhere, even on the Raspberry Pi. So, I want to tell you something about what I learned that may be interesting for you. These things I already knew, but after this work I believe in them even more. Monitoring the system is really essential; it is the first thing you should do. We had a problem with the unit tests: because there was a database, in the tests you had to create fake tables, populate rows, delete rows, do all this kind of stuff, and the tests took, I don't know, two hours to run. Now they run in seconds, because with HDF5 it's really easy. Also, I removed most of the original unit tests, because they were testing details of the implementation, and if I changed the implementation I was forced to change the tests. I tried to replace them with functional tests. Functional tests are what the scientists really want, because the scientists tell me: look, run this model, and they compare the numbers. That's what the scientists want. They don't want to test the implementation details. So I learned this kind of stuff. Also, I am old enough now, I have some white hair, so I feel that I can give some technical advice to people. My technical advice is: first, do things simple. Don't care too much about performance. Do things simple.
When things are simple, 95% of the time they are already faster without doing anything else, just by removing cruft. And also, don't spend time complicating your technological stack. I got requests from people asking me: why don't you try to use Numba, why don't you try to use graphics processing units, why don't you try to compile with the Intel compiler? These are all good things, but my problem was the database. First I have to remove the database; then I can think about this. I cannot complicate the stuff. First I simplify. And also, always challenge the assumptions. For instance, we had these very complex geospatial queries, so we were using PostGIS, and in the end, did we really need these geospatial queries? The answer was no, we didn't. So I removed the geospatial queries completely, and everything was better than before. So challenge assumptions: just because some code is there doesn't mean it should be there. And so my advice is: take on the most difficult problem that you can solve. Here, "that you can solve" is in italics. It means that if you cannot solve the problem, wait a bit. Have a plan B, because of course the big problem is not easy. But keep in mind what the big problem is, and do everything with the goal of solving the big problem. In this case the big problem was removing Postgres. I couldn't remove Postgres for eight months because of political issues, but I kept thinking about it. And be patient. Also some political advice, about dealing with the boss. You know, when you are a newbie and you come in and say, I would like to change everything, the boss, correctly I think, will tell you to wait a bit. He is right, so it's not a problem. Take the slow way. And every time you make a small change, make sure that the scientists, the users, get a major advantage. So you can say: I made this change and now this is 50% faster, and the users will be happy.
So the next time you propose a change, they will believe you, because they know from experience that what you did before was OK. So take time to build trust, because even if you are, like me, over 40 and a well experienced programmer, when you are in a new place you are a newbie, so you need to take your time. That doesn't mean that you must always say yes; you can raise your voice. You shouldn't be a yes-man. Sometimes there are technical issues that you can discuss and reach an agreement on. Sometimes you don't reach an agreement, simply because the issues are not technical. For instance, one thing: I wanted a single repository and not three repositories. For me it was a big issue; for the boss it was not. We did not get a single repository, you understand, but in the end we now have a compromise: from four repositories we now have only two. Not a single one, but you see, sometimes you can fight a bit. And I discovered several things. For instance, I did some experiments retrieving floating point numbers from the database. A floating point number is 32 bits, four bytes. Using psycopg it took something like 30 bytes, because the Python float is already 24 bytes. Add the layer of psycopg, more bytes; add Django, more bytes. So four bytes can become 50 bytes, which is an order of magnitude. It was surprising that the effect was so big. Another experience I had is that sometimes you make a design because you don't want to run out of memory. And you do that, and it doesn't run out of memory, but maybe it takes one week. So your computation runs for one week and you don't get any feedback. Maybe at the end you finally run out of memory anyway. Or you don't run out of memory, but it never ends. The scientist comes and asks: why is this stuck? So it is actually best to let it fail. It runs out of memory after half an hour. Maybe you put in the wrong parameter.
So they know after half an hour that they put in the wrong parameter, and that that was the reason. So sometimes it is actually better to have a design where you use more memory. Maybe it fails, but it fails early, and then you can take action and decide what to change. It is also more efficient if you try to do everything in memory. There are other stories that I couldn't tell. I have, let's say, two minutes, and in two minutes I can finish. What I did for the concurrency: I like concurrent.futures, so I changed everything, because we had reinvented concurrency in strange ways. We had at least three ways of doing concurrency. So I removed them all and used the interface of concurrent.futures, and I wrote my map-reduce on top of that. And I have a pluggable system: if I am on a laptop, I use concurrent.futures, with multiprocessing essentially; if I am on a cluster, we use Celery. Everything was tested. The Python 3 part was tested on Travis; for Python 2 we were using Jenkins. And it's good. Wheels are really, really great, because we have the problem that you saw in the keynote this morning. Especially the manylinux wheels, which are very recent, six months old. With those you can distribute your code on essentially all platforms without problems. h5py has some bugs; some strange behaviors can give you a segmentation fault. So I recommend it, but pay attention. I have some regrets too, because essentially for two years we spent time trying to optimize Postgres, and that was a battle against windmills. There was no chance of winning that battle. I should have removed more stuff. Especially the tests: because I am old, I am conservative. If I see some tests, I think these tests are important. If they break, I fix them. If they break again, I fix them. If they break again, I fix them for the third time.
After four times, I see that these tests are actually in my way, that I don't need them, and that it's better to throw them away and replace them with something else. After four times I do that. If I had done it the first time, I would have saved time. I also sometimes spent time supporting features that, after discussing with the scientists, I discovered were not even intended features. The scientists said: ah, the engine is doing that? I didn't know about it. And these were features that were well tested, etc., so I expected them to be very important. So these are regrets. Still, I am very proud of this slide. Very, very proud of this slide. Our code base is split, as I said, in two. One part is hazardlib, the low-level library, which is still in Python but is low level. It's the part that the scientists work on. Unfortunately you cannot see the numbers here, but anyway, this is the release of the engine and this is the number of lines of code, which goes from something like 30,000 to 70,000 lines of code. So in three years, because it is three years from release 1.0 to release 2.0, we more than doubled the number of lines of code, and that's fine, because the scientists added more models, more seismic models. But if you look at the size of the engine, which is my part, let's say, I reduced the size from 55,000 to, I don't know, 40,000. So after three years of working there, there is now less code than before. I am very proud of that. In the intermediate releases there was duplication, lots of ugly stuff, so there was more code, but finally I removed all that stuff and now the old engine is gone: 95% of the code has been rewritten. So let's say that I'm happy, even if it took nearly two years, and I am very proud of that. Thank you.