I actually love the database, the database which is sitting at the back end. How many of you have loved that? I am a DBA, and I have always faced people who go at the database, and eventually at the DBA as well.

But let me talk a bit about myself. This random guy looks exactly like one of you guys: two eyes, a nose, just any random guy. I started my career as an Oracle and DB2 DBA, with a bit of SQL Server as well. Then I ended up in a job here in Singapore with Ashnik, where I started with PostgreSQL and later on a bit of MongoDB as well. And now I totally love open source. At Ashnik we do a lot of open source consulting and a lot of open source solutioning as well, end-to-end solutions built completely on top of an open source stack. That's my Twitter handle. If you want to tweet about this session, if you like it, or if you want to curse at me, do that on Twitter, so my boss and everyone else can take care of it. Apart from technology, I love cycling. I go around Singapore on my bike: north, south, a bit of west as well. Those are the tracks I have done so far, long road biking. I love photography as well; the only thing which restricts me is money, because good pictures need good lenses and all that.

And here is why I love PostgreSQL. It is a database which is fully ACID-compliant. It has a very vibrant and active community. There is an open user group where people discuss all the new features quite publicly, quite openly. Even when a new change is pushed into the code, it is reviewed by people; the algorithm used for sorting is discussed publicly. People give their opinions, and those opinions are respected very well. If you want to get help, you can just shoot an email to the user group list and someone will look into it, someone will come back and reply to you. You definitely are going to get help from the community. And that is one of the reasons why I consider PostgreSQL a true open source database: everything is open, and you get help from the open source community as well. There are a lot of advanced features which allow it to scale very well, like multi-version concurrency control, which keeps multiple versions of the data so that you don't run into inconsistency issues. I won't get into the nitty-gritty and bore you much with those features. It is very developer friendly, and we will see ten reasons why it is so developer friendly. And it is built to be extended; we will see a couple of reasons why I say it is a very extensible database.

When it comes to developers, you are always worried: whatever I am going to develop, is it going to be supported on the back-end database or not? For that matter, all the languages that you see there have inbuilt connectors, and there are inbuilt connectors for a lot more than these, like a Node.js connector. And it is supported on a vast array of platforms as well. You choose a platform and Postgres will run on it, because it is written in C; anywhere you can compile C, you can run Postgres. It is supported on ARM architecture. How many of you have seen a Raspberry Pi? You can run Postgres on that. It has been ported to other platforms too: it can run on the POWER platform, a RISC architecture, and it runs very well on x86 as well. It runs on Ubuntu, CentOS, Red Hat, Windows, AIX. I know customers running it on AIX.
So now I am going to talk about ten features, ten reasons why developers should love PostgreSQL. These are the ten features you will be missing if you are using any other proprietary database; and when I say proprietary, you know what I mean, I will leave it at SQL Server and the like.

It has JSON features. It has native data type support: money, time, Boolean, all these data types which we are going to explore more. You can have foreign data wrappers here. Then you have user-defined operators; we will see what those are, how they work, and how they are helpful. You have user-defined extensions as well, which allow you to do more with an existing PostgreSQL database: features that you want to have, you can add via extensions, and we will see a couple of examples of how people have done that. You can create filter-based (partial) indexes; we will see what those are. There is more granular control over your parameters: there are certain parameters which you can control as a developer, set within your session, or control at the user level, and we will see how those work out. Indexes can be used for gathering statistics at runtime; I will explain a little about why statistics are important and why that particular feature is really helpful to you as a developer. And there is full text search, plus bulk upload APIs. I will start with one feature at a time and then elaborate on it.

JSON features. Have any of you bought something from Amazon? Amazon.com, not AWS; the website where you go and buy stuff, which is where Amazon started from. How many of you have gone to the Amazon website? Not many. So you still go to the physical shops and buy stuff, or you have gone to some other online websites. When you go to the Amazon website and you look for a phone, on the left-hand side pane there will be a feature filter, where you will find screen size, camera, operating system (whether it is Android), or the brand name: Samsung, LG, Apple. All these filters will be there. But when you go and look for shoes, you will find colour, size, brand, price range: a completely different set of filters. So can Amazon, or any e-commerce company trying to sell a vast array of products, afford to store all that in a fixed relational schema? No. You will end up with either too many fields or too many child tables. For that you need something flexible, and that is where JSON comes into the picture.

In a session earlier, one of the guys from the Postgres community, Satoshi, talked about JSON features, so I will not elaborate much on that. But you can store JSON documents inside PostgreSQL, and two documents can be completely different. You can create GIN indexes on these fields, and those GIN indexes are useful for searches, for retrieving data from your JSON data stores or your tables. Postgres also provides you JSON functions and operators, and using these JSON functions and operators you can write a data access API.
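As a rough sketch of what that looks like (the table, columns, and documents here are invented for illustration, and the jsonb type needs PostgreSQL 9.4 or later):

CREATE TABLE products (
    id      serial PRIMARY KEY,
    details jsonb              -- one flexible document per product
);

-- A GIN index speeds up containment searches over the documents
CREATE INDEX idx_products_details ON products USING gin (details);

INSERT INTO products (details) VALUES
    ('{"type": "phone", "brand": "Samsung", "screen": 5.5}'),
    ('{"type": "shoe",  "brand": "Nike", "colour": "red", "size": 42}');

-- @> is the "contains" operator; it can use the GIN index
SELECT details ->> 'brand' AS brand
FROM   products
WHERE  details @> '{"type": "phone"}';

-- And ordinary relational rows can be returned as JSON for a data access API
SELECT row_to_json(p) FROM products p;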
So the way I look at it is: I can write a data access API and store my data in JSON document format inside my database, as long as my schema has not firmed up. Once my schema firms up, I will translate those JSON fields into fixed tables, and I will still have my JSON API returning data in JSON format. There is no impact on my actual application; my application still consumes data in JSON format. That is how I can actually go agile as well: I can keep adding fields to my JSON document as long as I have the need for new fields, and once those fields have firmed up, I will translate them into columns and use the JSON APIs to return the data.

[Audience] You mean there is a temporary format for the data before it is converted to the relational schema, right?

Not really; it is an actual field in which you are storing data. The way it happens in agile development is that you start with a functionality, right? You don't know what fields you are going to need, so you keep on adding new fields. But what constrains you is the need to normalize, the need to have a proper design for your tables; that is when you need a proper fixed schema. So what I am telling you is: start with a JSON field and keep adding all that data. Once you have got enough data, once you have done enough development, you can see which fields you are going to need and which you are not, which fields are multi-value attributes, and where you have a normalization requirement. Then normalize them. And from the beginning, return all your data in JSON format using APIs; even after you have made them into proper columns and tables, keep returning the data in JSON format. The data now sits in proper tables, but what is returned to the application is still a JSON document, so there is no impact on the application, right? Your database can be as agile as your developer.

There is a vast array of data types which are supported natively: money, interval and so on. If I want to store a duration (this talk's slot is 25 minutes, which I am running short on), I can store that as an interval. This talk started at 2pm, actually 2:05pm, so I can store that as a time. It allows me to store data the way my application perceives it. Plus I can create more data types of my own; I can define my own data types. So as a developer, I have the choice of more intuitive types for my data, types which are more compatible with my application, so I don't necessarily have to do a translation. There are databases which do not support Boolean, so the way you store that data is as characters, or as integers 0 or 1, and then in your JDBC layer or in your actual program you do the translation. Oracle doesn't support Boolean as a table data type, and DB2 as well, I guess, doesn't support it, at least in the versions I have worked with, which were 11g and 9.7. Here there is no conversion required: you store data as you perceive it, and you fetch it the same way. Plus you can define your own data types, and you can define your own indexing methods to deal with those data types. Say you want to store personal details as a structure: you can create your own data type for that.
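A small sketch of both ideas (the tables, the composite type, and the values are invented for illustration):

-- Native types: no 0/1 or 'Y'/'N' workarounds needed
CREATE TABLE talks (
    title     text,
    starts_at time,        -- e.g. 14:05
    duration  interval,    -- e.g. 25 minutes
    recorded  boolean
);

INSERT INTO talks VALUES ('Why developers love Postgres', '14:05', '25 minutes', true);

-- A user-defined composite type for structured personal details
CREATE TYPE person_details AS (
    full_name text,
    phone     text,
    city      text
);

CREATE TABLE speakers (
    id      serial PRIMARY KEY,
    details person_details
);

INSERT INTO speakers (details) VALUES (ROW('Alice Tan', NULL, 'Singapore'));
SELECT (details).full_name FROM speakers;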
Foreign data wrappers. Now this is something which I love. Say you have a data store which is in MongoDB, or which is in Hadoop: you can access the entities in those data stores as if they were PostgreSQL tables. So you can say SELECT * FROM, let's say, customer_details, and that would get the data from MongoDB, and that MongoDB could be a sharded cluster. Now why is it beneficial to developers? Because you don't have to actually think about integrating or replicating data from other data stores and pulling it all together; you can choose a database of your own choice for each job. For example, in the banking industry you can have financial transactions running into your system every second or every microsecond, each transaction getting pushed down into your system. Those records you can push into a MongoDB database, because you are also going to do some kind of analysis on them, or into Hadoop, because you will run some offline jobs on them. But if you want to have a join between those financial transactions and your account data, you can do that in PostgreSQL. And the good part is that PostgreSQL can push the filter predicate down to these data stores: if you run a query with a WHERE clause, that WHERE clause will be pushed down to MongoDB and executed there, not in the PostgreSQL layer. And in future releases, with further enhancements beyond 9.5, it will even be able to push aggregates down to the underlying data stores. So that is a real advantage which you will have as a developer. You can integrate multiple environments in your program itself. You don't have to depend on DBAs, or on a data integration guy who is going to integrate all that data for you. You can build a logical data hub for yourself as a developer.
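A sketch of what that can look like with the mongo_fdw extension (assuming it is built and installed on the server; the addresses, names, and option spellings follow one version of the wrapper and may vary):

CREATE EXTENSION mongo_fdw;

-- Point a foreign server at the MongoDB instance
CREATE SERVER mongo_server
    FOREIGN DATA WRAPPER mongo_fdw
    OPTIONS (address '127.0.0.1', port '27017');

CREATE USER MAPPING FOR CURRENT_USER SERVER mongo_server;

-- Expose a MongoDB collection as if it were a local table
CREATE FOREIGN TABLE customer_details (
    _id       name,
    cust_name text,
    city      text
) SERVER mongo_server OPTIONS (database 'bank', collection 'customer_details');

-- The WHERE clause can be pushed down and executed inside MongoDB
SELECT * FROM customer_details WHERE city = 'Singapore';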
User-defined operators. This is another thing I use quite extensively, because my job also involves a lot of migrations, a lot of conversions. You can define an operator of your own choice. Here, I am giving a different meaning to the equals operator, telling it how to do varchar equal to integer. If I want to compare varchars to integers, or strings to floats, Postgres doesn't do those comparisons natively; that is not facilitated by PostgreSQL itself. But the good part is you can write a very simple operator yourself to do it. Once this operator is created, it is added to the library, and any equality operation which matches these two data types will be performed with your function, and you can define your own rules there. I can also replace an existing function: equals is generally just a string match which is case sensitive, but if I want to do a case-insensitive search on my database, I can define a function and override the existing equality operator. So you can define operators of your own, and you can define operators for a data type that you have defined yourself. It also allows you to do a lot of migration work, because I do a lot of migration. For example, and this is very funny: every other database you talk about uses the double pipe (||) as the concatenation operator, with one exception, SQL Server. For some very good or bad reason, Microsoft chose plus (+) as the concatenation operator, and it always does concatenation that way. Now if I am migrating something out of SQL Server to Postgres, or say from SQL Server to Oracle, I have to change my code everywhere. With PostgreSQL, I have the flexibility to define a plus operator which does concatenation for me, so I don't have to change anything in my code. As a developer, I have the choice of defining my own data types and my own operators for those data types, plus I can create new operators for existing data types.

User-defined extensions. You can extend the existing functionality by adding your own add-on modules. Once you have compiled them and added them to the database, these add-on modules work as if they were a native PostgreSQL feature. I will talk about some of the popular extensions. pg_prewarm, which just got added in 9.4: using this extension, you can preload data into your buffers, so before your application starts connecting you can load the data into the buffers and you don't face cold-read issues. Then pgcrypto, using which you can have encrypted data in your database; you can encrypt the data before storing it, which is particularly important for the banking and financial industry. pg_shard: this extension allows you to create a sharded cluster of Postgres, so you can have a distributed database where some of your data sits on one node and some on a second node. It works by hooking into the actual execution; how SQLs are executed has also been extended in this particular extension, and it is provided by CitusDB. PostGIS (and no, that is not a typo: it is not Postgres, it is GIS) allows you to create a complete spatial database inside PostgreSQL. Then there are other extensions to play around with the internals of Postgres, how buffers are stored, which rows are dead, all that information, plus more enhanced text search extensions for fuzzy string and trigram matches.

You can create partial indexes in PostgreSQL. Oftentimes you don't actually delete data from your application; you do a soft delete. You just mark the row as deleted: you set deleted to 'Y', or expired to true, something like that. You don't actually delete the data, because of various regulatory rules and all those things, and whenever you access your data, you always access it with WHERE deleted IS FALSE or WHERE expired IS FALSE. In those cases, these indexes are really helpful, because I can create an index on my accounts table saying WHERE closed IS FALSE. It is only going to cover the rows where closed is false, not any other row. Plus I can create smaller indexes on different datasets. I can create an index on the accounts table where account type is current, and another where account type is savings. These indexes are going to be fairly small, and those smaller indexes will perform faster.
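A minimal sketch of those partial indexes (the accounts table and its columns are hypothetical):

-- A table with a soft-delete style flag
CREATE TABLE accounts (
    account_no   bigint PRIMARY KEY,
    account_type text,
    closed       boolean DEFAULT false
);

-- Index only the live rows: far smaller than indexing everything
CREATE INDEX idx_accounts_open ON accounts (account_type) WHERE closed IS FALSE;

-- One small, fast index per dataset
CREATE INDEX idx_accounts_current ON accounts (account_no) WHERE account_type = 'current';
CREATE INDEX idx_accounts_savings ON accounts (account_no) WHERE account_type = 'savings';

-- A query that filters the same way can use the matching partial index
SELECT account_no FROM accounts WHERE account_type = 'savings' AND closed IS FALSE;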
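And going back to the extensions for a moment, they need very little ceremony once the modules are installed on the server (pg_prewarm ships with 9.4, pgcrypto with contrib; the table name reuses the sketch above):

-- One statement each, and then they behave like native features
CREATE EXTENSION pg_prewarm;
CREATE EXTENSION pgcrypto;

-- Warm a table into the buffers before the application starts connecting
SELECT pg_prewarm('accounts');

-- Hash or encrypt data before storing it (here, a Blowfish-salted hash)
SELECT crypt('my-secret-pin', gen_salt('bf'));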
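And for the user-defined operators described above, this is roughly what defining them looks like; the function bodies are deliberately trivial illustrations:

-- Compare a varchar to an integer by casting
CREATE FUNCTION varchar_eq_int(a varchar, b integer) RETURNS boolean AS $$
    SELECT a::integer = b;
$$ LANGUAGE sql IMMUTABLE;

-- Teach Postgres what '=' means for this pair of types
CREATE OPERATOR = (
    LEFTARG   = varchar,
    RIGHTARG  = integer,
    PROCEDURE = varchar_eq_int   -- spelled FUNCTION in newer releases
);

-- And a '+' that concatenates, for code migrated from SQL Server
CREATE FUNCTION text_concat_plus(a text, b text) RETURNS text AS $$
    SELECT a || b;
$$ LANGUAGE sql IMMUTABLE;

CREATE OPERATOR + (LEFTARG = text, RIGHTARG = text, PROCEDURE = text_concat_plus);

SELECT '123'::varchar = 123;       -- true, via the new operator
SELECT 'Hello, ' + 'PostgreSQL';   -- Hello, PostgreSQL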
I can control the parameters at a very granular level, at the database level or at the user level. Within the same instance, I can have two different databases, one for reporting and one for OLTP, and I can have different parameters for each of them. I can control them at the user level: I can have a user who is going to run batches and give that user larger sort memory, or I can set the consistency parameters low for that user so that I get a performance boost. I can set them at the connection or session level as well: just before doing a bulk upload, I can set synchronous commit off, which will give my bulk upload a boost.

PostgreSQL can also get statistics on the fly. Why are statistics important? Because statistics are how your optimizer decides how to query the data. If your table has got a million rows, your optimizer will decide that it is not going to do a table scan; if your table has got ten rows, it will do a table scan. But let's say you have got a highly volatile table which grows from ten rows to a million rows within ten minutes. Statistics are not going to get collected instantly; they are collected offline, or once an ANALYZE runs. That is where PostgreSQL helps you, because as a developer you don't really have to worry about collecting these stats after every bulk upload. The first thing which comes into my mind as a developer is: if statistics are so important, do I need to do a stats collection after every bulk upload? Not really. You can actually rely on PostgreSQL's optimizer, which is smart enough to do a quick check on the indexes. The indexes on these tables store some indicative data about the stats as well, and checking the same thing on the table rather than on the index would be costly, so it does a quick check on the indexes to find out how many potential rows there are, and based on that it decides whether to do a table scan or an index scan, and which join algorithm is to be used. As a developer, this is something integral to the system; you may not see it as very tangible, but it is really helpful, because it avoids that extra step of doing the stats update in your database.

There is a JDBC copy API for PostgreSQL which allows you to do bulk uploads. COPY is a command in PostgreSQL which copies data from flat files into the database, and with the JDBC API you can do that programmatically: you can write a program which reads a file, or reads from the standard input device, and loads it into the database. It is quite efficient; it works faster than insert statements, and even than parameterized queries.

The last thing, which is again quite useful to you as a developer: you can use PostgreSQL's functions and operators to write an application which does full text search. PostgreSQL is capable of facilitating full text search. You can store your textual data in PostgreSQL and create GIN indexes on it. GIN, by the way, is not the gin which you go out and drink; it stands for Generalized Inverted Index. GIN indexes are very helpful for indexing textual data: they don't index the whole text, they index it in a much smarter way. You can then use them to search your data and do pattern searches, like regular-expression searches, or finding a word somewhere in the middle of a paragraph. That is where GIN indexes are really helpful. It makes your life easier: you don't have to deal with another component which does just full text search; you can do it with the database's own features. You can also do fuzzy matches, or sound matches, names which sound quite similar; you have extensions for those. You want trigram matches? You have extensions for those too.
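A minimal full text search sketch (the documents table is made up):

CREATE TABLE docs (
    id   serial PRIMARY KEY,
    body text
);

-- A GIN (Generalized Inverted Index) over the parsed text
CREATE INDEX idx_docs_fts ON docs USING gin (to_tsvector('english', body));

INSERT INTO docs (body) VALUES ('PostgreSQL makes full text search easy for developers');

-- Matches the word anywhere in the document, and can use the index
SELECT id
FROM   docs
WHERE  to_tsvector('english', body) @@ to_tsquery('english', 'search');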
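For the bulk loading mentioned a little earlier, this is what COPY looks like from SQL; the file path is only an example, and the JDBC driver exposes the same mechanism programmatically through its CopyManager class:

-- Server-side bulk load from a flat file
COPY accounts FROM '/tmp/accounts.csv' WITH (FORMAT csv, HEADER true);

-- From psql, \copy streams the file from the client side instead:
-- \copy accounts FROM 'accounts.csv' WITH (FORMAT csv, HEADER true)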
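And for the granular parameter control described above, a quick sketch; the database name, role name, and values are illustrative:

-- Different defaults for a reporting database versus an OLTP database
ALTER DATABASE reporting SET work_mem = '256MB';

-- A batch user gets more sort memory every time it connects
ALTER ROLE batch_user SET work_mem = '512MB';

-- And inside one session, just before a bulk upload
SET synchronous_commit = off;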
If you want to explore PostgreSQL more and more, there are blogs from various community developers: Josh Berkus, Bruce Momjian, Simon Riggs, Magnus Hagander. And then there is an aggregator, planet.postgresql.org, where you can find the blogs aggregated, so you can go and search for the feature that you are looking for. Another thing which I do quite often is search in the community threads. There are user groups and mailing lists, and you can search through those discussions; a lot of your problems will get solved by just looking at the discussions, because chances are your problem has already been discussed and solved many times. Then of course there is the big book of PostgreSQL, the online documentation on postgresql.org; that is always helpful. And if you have got any questions, we have a few minutes for those.

[Audience] What is the license of PostgreSQL?

It is an open source license, the PostgreSQL License. It is pretty much like the BSD license. Yes, you can use it freely. Plus, if you would like to have support, there are a lot of companies which provide support for PostgreSQL; there are companies based out of Japan as well, like Uptime Technologies, and he is here today. We also do a lot of support as part of our services. I think the license is one of the reasons why many companies pick it up: it is a good starting point for companies, and there are a lot of resources out there.

[Audience question about when to use MongoDB versus PostgreSQL]

MongoDB is built for a completely different purpose. That is for when you have distributed data, when you have to run parallel or distributed computing; that is where MongoDB is meant to be placed. Whereas PostgreSQL is for when you have scalability in terms of the number of users connecting, the number of users writing to or reading from the database, and you still want consistent data: you want ACID compliance in your application without that ACID compliance having to be built into the application itself. So they are used for two different purposes. You define the reasons why you want a data store and what you want from a data store, and then you choose one of them.