 Hi guys, my name is Tamás Bundt and I would like to talk about databases related to LibreOffice mainly. Before introducing myself I would like to start with a joke. It is about sanitizing our database access. Actually it is not quite relevant to LibreOffice base itself because we don't implement privileges when using base. So once you have the privilege to open your database access and you can see the tables, you can drop it as well and stuff like that. So I started to contribute to LibreOffice in 2016 with a sum of code. I worked on the Firebird database management system to be precise the connection to it and after that I became a contractor at Colabora Productivity and in 2018 I became a LibreOffice developer and I want to start with an overview of how it works as a developer's point of view. I would like to start with how the user sees the program. You can use base as a user friendly database access tool because of that it can be used during, for example, final exams in secondary school. We had a choice that when I made my test on informatics that I would either prove my skill with LibreOffice base or with Microsoft Access. I chose Microsoft by the way but of course I regret everything. Also you can create reports and forms with base which is a way to visualize your data. With reports you can create something to print and with forms you can create good looking something that you can insert some data with which you can insert some data to your database. Also one of my friends used base as a migration tool not because the high SQLDB migration that was created a few years ago but because there is this feature where you can open two different database connections with base and if you copy paste it one table for example you can transform your data to another database. In my point of view it's something that I can work on and this is the whole picture. This cylinder on the top left it would be a database and there is an interface called SDBC which is star office or star database connection. It's a standard interface which is created to be able to connect to every database that LibreOffice can work with and the connectivity module is used to make that connection. There is this DBXS rectangle on the right side. It's the module which implements the graphical interface and some services which used the star database connectivity and there is this thing on the bottom with an X at the end. That's the extended database connectivity interface. It can be used to create database designs. It has the functionality as SQL data definition language if you are familiar with that part of SQL. For example if you have a custom database management system which I don't know doesn't use SQL but use some crazy ways to create a table then you have to make the connection through this extended stuff. I want to talk a bit more about those boxes on the previous side. What about the database? We have a lot of ways to compare databases to each other. When talking about LibreOffice we can talk about embedded and external databases. That means that if you have an embedded database then it's bundled with the ODF structure. The external database is something that you connect to. For example you can have a MySQL server on your machine or somewhere else and you can make a connection with an ODF document. And how the server is installed on your machine it can be either bundled or system. Not just the database server but if there is a direct connection library then this library can also be bundled then system. If it's bundled then the build system of LibreOffice makes some magic and you have the database management system in your file system right next to the LibreOffice code. And if it's system then it uses your own installation. You can have direct access. For example with Firebird we use direct access. That means that no other middle tool is being inserted. And for example we can use GDBC and stuff like that. That means that we connect through another standard interface. Also we have some exotic database possibilities for example a writer document can also be seen as a database. Miklos can say more about this. You can connect to a spreadsheet as well. Or you can use a CSV file or a text file. So there are a lot of possibilities. Of course you have a lot of restrictions when you are connected to a text file if you understand what I'm saying. Okay. And there is a hello word written in SDBC. What I did here, that language is the basic language of LibreOffice. And I made maybe the simplest connection. I connected to my SQL server. You can see that from the URL in the first row. So every database management system to which we can connect has a URL. And it says it's my SQL. It's the direct connection. Okay. Local host and we have port number two. We can get the driver manager from nothing with this create you know service thing. And using the manager we can get a connection. And also you have to pass some params that I left out from the screen. You can pass your username and password and things like that. And then I create a statement and execute it. Okay. And now I want to speak about SDBC in detail. You have this driver and connection interfaces to create the connection. Which is not that interesting. But my personal favorite is the X database method data interface. Because it's one of the biggest interfaces in LibreOffice. You have around 200 methods to call. And you can execute statements with the X statement and prepare statement things. It's very similar to GDBC. Actually it's the same as in GDBC. And you can get the results with the X results interface. There are also some utility stuff. For example the SQL exception is the exception you want to catch if you want to debug something in the DBXS module. Okay. And I want to show you an example. This is my SQL dialect of SQL. With the first command you can get the table names. And what's the second one? When you connect to a database with LibreOffice, you have a screen with the tables listed. And if you click on one of the tables, then you can see the data. And in order to see those tables, we need not just the name of the table and show table statement does that. But we also need a lot of other stuff like catalogs and schema. And there might be, depending on the database management system, some problems with the query. For example, my SQL gives back the system tables as a base table. But we are in SDBC handling it as a simple table. So you have to cast it. Yeah. That's a method from X database metadata. I wanted to show you one of its methods because it represents the main idea of X database metadata. This interface wants to allow you as much information as it's possible. And maybe you have a database management system which doesn't support something. But there will be one system which will. And if you don't have the method for curing any information, then you will end up with a lot of switches and SQL statements somewhere where there shouldn't be any of those. This particular method, as far as I remember, it returns Boolean whether you can. So if you have an open statement and there was a rollback, then it returns whether you should destroy this statement or not use it anymore. Or it is possible to continue using it. Okay. This is another of my favorites. It's from X result set metadata. We have two methods to implement. One is writable and one is definitely writable. Yeah. There was already an implementation and there was a comment about it which was to-do and a question mark. Yeah. Currently every driver in LibreOffice implements the two methods the same way. Or one of the methods called the other one. I couldn't find anyone who could explain the situation where it's writable but not definitely writable. So please tell me if you know a situation for that. Yeah. When I upgraded the Firebird driver, I had some hard time with the database connections. The thing is that most of the database management systems differ in this question. There is no standard way to pass connection parameters. It's a tricky thing to do because SQL as a language is standardized. Most of its features are standardized and the database management system do the same. But before you can connect, you have to use something else because you can execute an SQL statement only if you already have a connection. So Firebird and MariaDB has a C API. You have to pass some magic C stuff there. Or at EDBC, it's quite simple because it works the same way as SDBC. But for example, what if I have a text file? Actually, I think there is nothing to do unless passing the DBX as the information that you cannot do a lot of stuff. Okay. Yeah. And another interesting difficulty I saw when I was working on the Firebird driver that was when I implemented it was about you can have relationship between tables and you have a nice graphical interface to show these relationships like you have a foreign key or something like that. And in order to query information about that kind of stuff, usually you have to make a lot of joins. For example, the method get imported keys in Firebird can be and is implemented in a way that it's a very large SQL query on the system tables and there are five joins in it. It's pretty hard to maintain but luckily the interface of SDBC never changed. So it's not that much problem. But yeah, there is a lot of data which has to be queried. Like when you want to get information on columns and you have to do it in different ways when you use a database management system. That's like the data type scale precision and I don't know. And I would like to mention the auto-incrementation question in the next slide because there are a lot of differences between a database management system when it comes to auto-incrementation. There is a standard which says that you should put this identity keyword at the end. And Firebird does something similar or I don't know, maybe it's the exact same. And my SQL has another and so on. And fun fact, I was curious what Wikipedia has in that question and it listed the Microsoft, what's that? Microsoft server SQL. So it used, it listed this dialect. And there is one cute piece before Firebird 3. There was also a possibility to create an auto-incremented column. But it was quite complicated. On the other hand, it's very close to the SQL standard because the trigger looks the same. As far as I know, but I'm not sure, there is no specification on the standard how to change the termination character. In the third and the last row, there is this set-term thing. It changes the termination character from two, what's that? Sorry? The other one. What was that? Explanation mark. Okay. Enough of that. And for example, if we want to implement the SDBCX driver as we do, then we have to say in it how the auto-incrementation keywords go into the statement. So for example, there is a method in the SDBCX interface. Actually, it's a namespace and it has an interface in it which is called table or I don't know, X table probably. And it has a method which gives back the exact create statement for a table. And if it's auto-incremented, then you have to specify that part as well. Okay. Yeah. And I didn't speak much about the DBXS part. So there is where you can find the graphical interface like wizards and views. And another thing, the magic copy-paste implementation can be found there as well. And there are a bunch of services implemented. Database connect can be used to access registered databases. So we have a registry. It's a declarative way to register database connections. So it's XML. And you have to list. And there are those database management systems listed which can be connected to the DBXS. And the data source is the service which establishes the connection. Also you can find here the XML import-export. So if you save your base thing, usually I'm not quite sure of the terminology because I could say that when you press control S, then you save the database. But it wouldn't be really correct because if you have an external database, then you only save your connection to the external database, whatever. So when you save, then this import-export thing is called. And there is the SQL DB migration code there, too. And these are the useful stuff I found when I looked deeper into the connectivity module. There is an SQL parser which is quite good. I didn't find much better implementations on the Internet which would be free software. But it lacks some functionality. For example, it doesn't handle different dialects of SQL. So it aims for the common part which is useful in a lot of cases. But for example, if you want to aim for database migration, then it has to be extended. Another thing that I don't like about it is that it parses the SQL statement and you can obtain the information but you have to use hard-coded integer to get the data. It parses the statement into a tree. And for example, if you have a create table, I don't know ID integer, I don't know primary key, then the primary keyword may be the six value. So you know what I mean. Also, you can find here something called the driver manager which helps the database context service. Not that interesting. And there is an interesting library called common tools where you can find the base implementation of SDBC, SDBCX. So these are those libraries that aims for the common part of these implementations. So once you have to make another driver, then you have to implement or you have to inherit from this one. Or at least it's useful if you do that because then you don't, you won't duplicate code. Yeah. Okay, some debugging hints. When I fix something in the base part, I started usually with putting a debug log on the SQL execution. There is only one or two places in each driver where SQL statement can be executed. So if you put a log there, that's a pretty successful idea. SQL can be used pretty well in logging because of its declarative nature. Also, usually when there is a problem, there will be an SQL exception thrown. So I usually set up my GDB with catch throw. It means that you will catch the exception which may be thrown. I was curious if I could specify GDB that I want to catch only one kind of exception. I don't know. Maybe you are more experienced than me. I couldn't do that. But tell me how to do that if you know. Please. Okay. And now I want to talk about what happened this year. We went for creating a new MySQL connector. There was already a C++ connector which we used. But it had license reasons. It had license problems. So the MySQL C++ connector couldn't be used in the core project because it was GPL. It is GPL and that cannot be put into the core project. So instead of using that, we went for the MariaDBC connector. These two things, the MariaDB and MySQL are actually the same implementation. They are two implementation for the same standard. But then we had to rewrite the whole SDBC connectivity driver to this new connector. So the picture is the following. That we have a SDBC driver which uses the MariaDBC connector, which uses a database, whatever it is. It can be MariaDB or MySQL as well. Yeah. There are some bugs which we fixed. Yeah. And we have now MySQL test suite which cannot be used when you build your project. It's not used implicitly because you need a server to run the tests. So there is an environment variable which if you pass to the unit test, then it will be running as well. And you have to start your server first. Yeah. And the problem with that is that there are a lot of servers which can be used with Libre Office. And of course, I cannot test all of them. I tried. I usually use MariaDB because that's the default on my operating system. But I managed to install MySQL next to my MariaDB implementation. Okay. And what to do now? There are these meta bugs which collect the bugs that should be fixed. There are 34 open bugs. I checked it yesterday about defaulting Firebird instead of SQLDB. For the MySQL connection, there is also a meta bug. But most of the MySQL related bugs are not connected to it. So I think it's a problem. But what I found is six bugs. But only one is open. The other five has been fixed. And there is a meta for the migration. And it holds 16 bugs. I think it interferes with the default to Firebird, not SQLDB meta bug. But I'm not sure of that. Okay. There is this question. Should it be, should the migration be experimental? And another question, where is the, don't ask me again button. There is a dialogue which handles if you want to migrate or if you want to postpone it. And we would like to implement a checkbox that please don't annoy me again. It's not yet implemented, but under slow progress. And I think it's experimental now. But I'm gladly, I would like to hear your opinions about it, of course. Yeah. And there is a cool Wikipedia page for the migration. I just found it two days ago. And it's really great. So if you are interested in the topic, just check it out. And these are the four most wanted bugs I found. There is one with corrupted numeric data. When you want to migrate, you just got wrong data. And if the user accidentally saves his connection, then it's hard to recover it. Actually, it is in the, all the data is kept in the ODF format. But you have to look at the link below to recover than your data. And there is a second bug. It looked interesting. And there are a lot of conversations about it. And I didn't quite catch what the problem is. But I'm on it. There is the don't ask me button again. And my favorite is the migration of complex views because we have to extend the SQL parser for that. And if I would be a time millionaire, I would be just right into it. Yeah. Thank you. And please ask page.