Thank you for being here. OK, this is a story about love, so I think it's the moment to talk about love. What is the meaning of loving what you do? It's not a rhetorical question. I want an answer from you. I really want an answer. What is the meaning, for you, of doing what you love? All of those were good answers. [inaudible]

I am an open source developer, and every day I work with Python, writing in Python for Postgres and with Postgres. If you want to tweet something, you can use the EuroPython hashtag or the PMP love hashtag for this talk.

So I promised you a love story, probably one of the biggest love stories of all. There is a match made in heaven between Python and Postgres. You can see the elephant and the python in love with each other, because one of the questions that I've been asked, not only yesterday but also at different conferences, was: but do Python and Postgres get along well together? And yes, a lot. They have a lot of things in common. Really a lot of things in common. First one: the first letter is a P. And they are both open source. Both of them have a huge, really, really huge community that is really, really involved in the development of the programming language and of the database. In the case of Postgres, the community is responsible for the robustness of the database. Both are powerful. Python is powerful and flexible. Postgres is powerful and stable. Both of them are really well documented. You can find a lot of documentation on the internet on Python and Postgres together.

You already know Python, because this is a Python conference. So, please, allow me to introduce Postgres to you. One important thing.
The name is PostgreSQL, or Postgres, or, if you feel like a close friend, PG. OK? If you call Postgres like this, the elephant will be happy. Please, please, please, please, don't call him Postgre, or Postgres-SQL, or any other permutation of the letters of the name. Otherwise the elephant will be confused, and I'm not sure that he will answer your questions. OK. Postgres. PostgreSQL. OK.

It is, de facto, the default database choice for many Python developers. A lot of developers around the world have chosen Postgres as the default database for their applications. Obviously, most projects support more than one database, but usually testing and development are done on Postgres. Why? Because Postgres is free, free as in speech. Postgres is stable. Postgres is feature-rich, and we are going to see just the tip of the iceberg of the features that are part of Postgres. And Postgres is released frequently: once a year with a big release with new features, and about once every three months with bug fixes and minor features.

We have a lot of features, because it's feature-rich. The first one is MVCC, which obviously is not something that only Postgres implements, but it's something that Postgres supports. It's multiversion concurrency control, which makes sure that your transactions are isolated from each other, so when you do something, your database is always consistent. Because the prime directive for a database is to keep your data safe, and Postgres is particularly good at doing that. During my work I've seen databases crashing in really bad ways, but I've seen people recovering the data up to the moment of the crash. So Postgres is safe.

Postgres has transactional DDL, which is less common than MVCC. Basically, when you make a change to your table, for example you drop a column, if you do that in a transaction, that action is transactional.
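
As a rough sketch of what transactional DDL means in practice, the snippet below drops a table inside a transaction and then rolls the change back. It is an illustration under assumptions, not something shown in the talk: it uses Python's built-in `sqlite3` module as a stand-in so that it runs without a server (SQLite happens to roll back DDL as well), and the `talks` table and the `drop_and_roll_back` helper are hypothetical names. Against PostgreSQL the same sequence of statements would go through a DB-API driver instead.

```python
import sqlite3


def drop_and_roll_back():
    """Drop a table inside a transaction, roll back, and report what was visible."""
    # isolation_level=None hands transaction control to us (explicit BEGIN/ROLLBACK).
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE talks (id INTEGER PRIMARY KEY, title TEXT)")
    conn.execute("INSERT INTO talks (title) VALUES ('a love story')")

    conn.execute("BEGIN")
    conn.execute("DROP TABLE talks")  # DDL, but still part of the open transaction
    visible_inside = conn.execute(
        "SELECT count(*) FROM sqlite_master WHERE type = 'table' AND name = 'talks'"
    ).fetchall()[0][0]
    conn.execute("ROLLBACK")  # undo the DROP TABLE

    rows_after = conn.execute("SELECT count(*) FROM talks").fetchall()[0][0]
    conn.close()
    return visible_inside, rows_after


if __name__ == "__main__":
    inside, after = drop_and_roll_back()
    print(f"tables named 'talks' inside the transaction: {inside}")
    print(f"rows in 'talks' after rollback: {after}")
```

Inside the transaction the table is gone for the session that dropped it; after the rollback the table and its row are back, which is the safety property described here.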
Every operation that involves changing your database is safe and transactional. So whoever is working on the database at that moment is safe until the transaction is committed. This is something that does not happen in every kind of database. Even bigger databases like Oracle do not support something like this.

Postgres supports table inheritance. Table inheritance allows you to create a parent table with one or more child tables, to split your data into sub-tables, and to access them through the parent table following a logic. And from Postgres 9.5 you can do that on remote servers too. So you can shard your data across remote servers and split the data and the load evenly over different instances of Postgres.

Postgres supports full text search. This is very important, especially for people that use Django and develop in Django. It is so important that Django has a plug-in that allows you to use full text search on Postgres easily.

Then we have replication. Replication allows you to create a server, the master, and clone it onto another instance, the standby. The standby is always synchronized with the master, and in case of disaster, if the master goes down, you can promote the standby and reduce the downtime to a minimum in a bad situation.

Then Postgres supports physical backup. And this is cool, at least for me, because I am a disaster recovery expert. Physical backup lets you copy your data directory while your server is running. You don't have to stop your server to make a backup. And you copy the physical files of the server instead of asking for an SQL dump. So everything happens while the server is running, with no downtime for you. Taking backups is easier and faster.

Postgres supports point-in-time recovery.
It's like a kind of magic, because if someone drops a table at 10 o'clock in the morning, you can recover your database to 9:59 and restart from just a second before the disaster.

Then it supports JSONB. JSONB is a specific data type that was designed for people that like to use JSON, especially people that like the NoSQL kind of data. You can store entire documents written in JSON inside your database, so you can mix a relational database with something different.

And another cool thing, which we will talk about later but I'll introduce to you now, is the foreign data wrapper, FDW. It allows you to connect to every kind of data source outside the database. For example, you can connect to another database, not only Postgres but also MySQL, Oracle, or DB2, or you can use a text file as a source. You can use a remote service as a source. You can use anything and import it into your database as a table. And you can run queries on that, and if the remote source allows it, you can also insert data, writing straight from inside your database.

OK, and we also have PostGIS. PostGIS is a geospatial extension that adds extra types, extra functions, extra operators, and index enhancements for handling geospatial data. PostGIS has a lot of functions that allow you to move inside a geographic area and find points that are near each other, and it's very important for people that develop using GIS.

This relationship between Postgres and Python is a virtuous cycle because, as I said, a lot of Python developers choose Postgres, and because of this, the software that is based on Postgres, and the examples and the documentation based on Postgres, are better than the others. So more people are going to use them. So the PostgreSQL community grows. If the PostgreSQL community grows, the database grows stronger. If the database grows stronger, more people are going to use Python and Postgres together, and this is a cycle.
Because of this, the Python community grows, and the cycle restarts. This is important.

But now it's time to talk about the love story, so let's get down to business. This is a showcase of things that have been done using Python and Postgres, because a lot of people ask me: OK, but can you tell me at least a couple of projects that have been created using Python and Postgres? Sorry. Yes, I can. I created a talk on that.

So, first thing: we want to connect to Postgres using Python. We have a lot of Python ORMs, like SQLAlchemy, Pony ORM, and the Django ORM. They are written in Python, they handle Postgres, and they have one thing in common: every ORM that I named uses psycopg2. No, no, clap your hands, because it's a moment of Italian pride for me: the main developers of psycopg2 are Italian. Really. So, psycopg2 implements the Python DB API 2.0. It's based on libpq, the standard client library for Postgres, and it's open source, because it's LGPL. So it's really free. It's really easy to use, so easy that a lot of people use it, and I've seen projects around the internet wrapping psycopg2 to create an even easier interface for it. So it's really, really powerful.

Obviously, it's not the only one. There are a lot of other drivers. These are the most famous: we have py-postgresql, PyGreSQL, pg8000, ocpgdb, and a lot of words and letters together. Anyway, psycopg2 is not the only one. There are a lot, and every one is valid.

OK. What if I tell you that you can use Python inside Postgres? Every, let's say, serious database usually has an internal procedural language that allows you to write procedures to handle your data faster than you would in an external program. Postgres allows you to load Python inside the database and use Python and its libraries inside your database to modify your data.
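
To give a rough idea of what Python inside the database looks like, here is a minimal sketch that is not taken from the talk. It assumes the `plpython3u` procedural language is available on the server; the function name `pymax` and the variable name `CREATE_PYMAX_SQL` are only illustrative. The SQL is kept in a Python string so the snippet stays runnable on its own, and the plain `pymax` function mirrors the body that PostgreSQL would execute.

```python
# SQL that would define the function on a server with PL/Python installed
# (assumption: the plpython3u language is available there).
CREATE_PYMAX_SQL = """
CREATE EXTENSION IF NOT EXISTS plpython3u;

CREATE FUNCTION pymax(a integer, b integer) RETURNS integer AS $$
    return max(a, b)
$$ LANGUAGE plpython3u;
"""


def pymax(a, b):
    """Plain-Python mirror of the function body between the $$ markers."""
    return max(a, b)


if __name__ == "__main__":
    print(CREATE_PYMAX_SQL)
    print(pymax(2, 3))
```

Once created on the server, such a function would be called from SQL like any other, for example `SELECT pymax(2, 3);`.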
You can use Python to react to actions like insert or update, or you can use it to prepare your data for a huge dump of data, whatever you want. You can use Python inside Postgres, and write Python code inside Postgres. And this adds the flexibility of Python straight into Postgres.

But wait. I promised you that I would talk more about foreign data wrappers. As I said, you can use the foreign data wrapper technology to connect to something external. Usually foreign data wrappers are written in C. But if you use Multicorn, which is software written in Python, you can use Python to write foreign data wrappers. In 2016, at the Italian Python conference, I gave a talk on this, and just for fun I wrote foreign data wrappers, where I think the longest one was 150 lines of code, comments included, that allowed me to connect, for example, to SoundCloud and query the service, asking for songs straight from inside the database, and then I was able to see the results of my search in a table. That project started because I was really trying to organize my music collection, and Postgres helped me to do that. If you want to see it, you can find the code on my GitHub. There is also a video. Unfortunately, it's in Italian, so if you don't understand Italian, you can skip that.

Another thing that you want, after having loaded the data through foreign data wrappers and having loaded Python inside Postgres, is to back up your database, because I really can't stress enough to people the importance of making backups. You can do that using Python. There are a couple of tools that allow you to do that. The most famous are WAL-E, a really nice program that works really nicely with S3 from Amazon, and another one that is a bit famous, Barman. So let's talk about Barman. Why would you like to use Barman? Because it's open source. It's one of the most used backup tools for PostgreSQL. It's feature-rich.
It's easy to use, because one of the goals of the developers while developing it is to keep it simple. And it's developed by a team of nice people. How do I know that? Because I'm one of those people. I'm one of the Barman developers. So let me introduce Barman for a bit. We are going to release version 2.2. The alpha version of 2.2 will be released on July 17. And the killer feature that we are going to introduce is rsync-based parallel backup, which allows you to copy a huge database to your backup server faster, really, really faster; we have done some tests on that. And this is important because, like I said, the more people use Python software with Postgres, the better the software becomes. And Barman is an example of that. I can say that because I have worked on Barman for a lot of years. Barman grew stronger because of the people that reported back to us and that funded some features like this one. And if Barman is stable now, it's because of this.

OK. I'm not able to pronounce it correctly, so I'm going to call it HA. The meaning of that term, high availability, is that you want your database to be available to all the people, most of the time, because if your database is down, probably that of one of your competitors is not. So the more you reduce your downtime, the better it is for you. How can you handle that? As I said before, we have streaming replication, which allows you to synchronize more than one server with your master. That said, how can you handle the promotion of one or more servers in case of disaster? You can use this Python software called Patroni, written by Zalando, so it's used by them. It's open source, because it has an MIT license. It's written in Python and it's based on ZooKeeper, etcd, or Consul, because you have more than one option. It depends on what you know.
It's really powerful and, to explain it simply, when your master shuts down, there is a discussion between the other nodes to decide which node will be promoted to master, and it happens almost automatically. So no more calls in the middle of the night because the database has shut down. Obviously you have to recover it, but you can do it the morning after.

We have talked about databases, we have talked about HA; now we want to manage our database, maybe easily. There are a lot of tools that allow you to do that. We have OmniDB, which is open source and has recently been rewritten using Django. Version 2.0 of this software has been written from scratch using Django; it's cross-platform and has a nice SQL editor. This software is written by two really amazing guys from Brazil, and now they work for 2ndQuadrant. They are my colleagues.

Then we have another tool that has historically lived close to PostgreSQL. It's pgAdmin 4. It's open source, it's multiplatform, and it gives you the ability to see the plan of your query, because when you write a query, PostgreSQL analyzes it and decides how to act. You can see the plan that PostgreSQL created for the query and see whether you can optimize your query to be faster or to act differently. And another thing that is really, really, really useful, especially for developers: it gives you the ability to debug the procedural language called PL/pgSQL while writing stored procedures. Usually it's not easy to debug stored procedures; here you can do it by putting a breakpoint in your code and analyzing it step by step.

After the graphical interfaces I showed you, because OmniDB and pgAdmin 4 are graphical, we have the command line, which is the tool that most DBAs and sysadmins love.
We have the default client tool, psql, which is part of the core of PostgreSQL and is released with PostgreSQL. It's really powerful, it was born to work with PostgreSQL, made by the people that develop PostgreSQL, and it has a lot of meta-commands that allow you to perform actions like retrieving descriptions of tables, retrieving descriptions of schemas, following foreign keys, or listing databases, just with a backslash command. Usually it's a backslash and one letter, so it's fast and easy to use. But obviously it's not perfect. It has some issues, and browsing the web, searching for nice tools for this presentation, I discovered that pgcli exists. It is obviously less powerful than psql, but it gives you syntax highlighting and smart completion, and it always tries to pretty-print the output of the tables on your terminal. Obviously it's not like psql, but if you have to write just a quick query, it's faster. And the smart completion is obviously contextual to what you're writing.

The last one is the workload analyzer, because you want to monitor how PostgreSQL behaves. You can do that using this software developed by Dalibo, a French company that works on PostgreSQL like 2ndQuadrant. It is open source and is composed of two parts. One part is an extension that resides in PostgreSQL and is obviously written in C, because it's part of PostgreSQL. The other part is the user interface, which is entirely written in Python. It allows you to see real-time graphs and performance charts, so you can inspect your database while it's working and understand when and why your database is under load at that moment: whether you have a huge amount of locks, or it's the fault of a couple of queries that are really long, for example. You can find that out only by monitoring your database, and this is a Python tool to do that.

What I showed you is just the tip of the iceberg of the things around the world that are written in Python for PostgreSQL.
What I showed you is the result of love and passion, and that's why I said that passion is a keyword for this talk. What I want from you is to get involved, spread the love, and be part of the virtuous cycle that I showed you before. You can do that by participating in conferences like this one, or like these two, which are the next conferences on PostgreSQL. We have the Italian PGDay, which is going to happen in Milan on 13 October 2017, and PostgreSQL Conference Europe, which will happen in Warsaw on October 24 to 27, 2017. At these conferences, at these events, you can talk with the people that write the code of PostgreSQL. You can talk with people that are there to help you write better Python software with PostgreSQL, because it's their job. So it's important to meet these people and to get involved in that virtuous cycle. Thank you.

OK. Thanks, Giulio. Who has questions for him?

Thank you for the interesting presentation. I'm curious about... I'm curious... Oh, OK. I'm curious about PoWA, which, as I understand, allows you to get some statistics on how the PostgreSQL database works.

OK. No. Sorry for the technical issue. So, did you hear the question? Do I have to repeat it? No, OK. He asked me whether the workload analyzer could impact the performance of the database. Yes, it could, marginally, but if you're in a situation where your database is always under heavy workload, maybe you want to spend a bit of horsepower on trying to understand what's going on. It's something you can deactivate. It's an extension. It's part of PostgreSQL, but you can turn it on and off. So you can turn it on when you need it and turn it off so it doesn't impact your performance. Fine? OK, good. Are there questions?

So we have a table of 100 million records. We want to do a select of one. So is there anything specific to PostgreSQL that's going to allow us to do faster queries and searches than, let's say, MySQL or some other database?
So if we have 100 million rows, and we want to do a select query, just select one from those, is there something specific to PostgreSQL that's going to allow us to optimize the database, like clustering or indexing?

I'm repeating the question so I'm sure that everyone heard. He asked me whether something exists that helps you to create faster queries. This is the sense of the question, right?

Absolutely, writing faster queries. Yes.

As I said, pgAdmin 4, for example, lets you see the plan of a query. So you write a simple query, just like a select star from something with a million rows, which for Postgres are not a lot of rows, to be honest. And you can see how PostgreSQL intends to act on that query, and this is graphical. You see what happens, and you can decide that you can easily split that query into a query and a subquery, or into something that's more performant. It depends on how PostgreSQL reacts. What you are searching for is right inside the database. It's part of PostgreSQL. Is there another question?

By the way, I will be at the 2ndQuadrant booth in the Piazza Room. So if you have questions, you can find me, or you can look for this guy, stand up, Marco, for technical questions. Or if you want to ask something that is more marketing or stuff related, you can ask that girl over there, or the girl who is at the booth. OK? OK. Is there any question down there? Ah, I didn't see you.

Thanks a lot for your talk. I'll try this. Can you hear me? OK. You talked a lot about the relationship between Python and PostgreSQL, and I actually know this pretty well because we work with both at my company. But I'm wondering, for other people who don't have experience with PostgreSQL yet but are already Python developers, do you have any good how-to material or introduction, so they can learn more about using PostgreSQL with Python?
Because I think people are pretty familiar with databases in Python, so they might be using Mongo or MySQL or something. How can they get started with PostgreSQL? Do you have any tips for them?

OK. So he asked me for some tips to get started with PostgreSQL, maybe migrating from MySQL or Oracle. Right? OK. Probably the biggest tip I can give you is: read the documentation, because the PostgreSQL documentation is really, really well organized, and you can find exactly what you need with just a simple search on the documentation website. There are a lot of tutorials on how to install PostgreSQL, and usually for the biggest distributions it's just one command on the command line. As for getting started, PostgreSQL is almost pure SQL, with no strange constructs. So if you know, generally speaking, SQL theory, you are able to use PostgreSQL right out of the box.

So one issue that we have when we are developing for databases is that we wind up with the schema. We may have hundreds of tables, we may have dozens of stored procedures, functions, everything like that in the database. What is the best way to organize and to manage that process, and do you know of any tools that allow you to actually manage and keep all that stuff in sync? You can do it on the command line, yes, but are there tools that allow you, for example, if you have a stored procedure and you make a change to that stored procedure, to automatically publish all the changes to the database from your repository?

I can answer using that, yes, OK. You have heard the question? Yeah, OK. There are tools that allow you to do that, but they are not written in Python, so they were not part of my presentation. There are tools that allow you to keep track of your changes and the different schemas, see the changes, patch the schemas and reload them.
Then there are techniques for using PostgreSQL that allow you to do that with less impact, but they are techniques, not a specific tool to do that. But in my experience working at 2ndQuadrant in these years, I've seen people doing a lot of strange stuff to do that, and I've always seen it working, so Postgres is tough. It can handle every kind of change and, as I said, because it's transactional, you can try that. No, it's not right? Roll back, everything is good. OK. So, another question? Ah.

It's more of a follow-up remark to what you just asked. There is Alembic, from the same author as SQLAlchemy, which is a Python tool to track schemas. He knows more, I guess, and it doesn't do everything, I guess, but it might be worth looking into.

All right, one more.

OK. If you search on Google for "PostgreSQL awesome" on GitHub, you will find a web page with a list, always kept up to date, of software for PostgreSQL. So you will find the tools you are asking about, you will find a collection of tools for PostgreSQL, and, if you don't know it, there is also an awesome Python list.

OK, thank you. All right.