Okay. They say there are two things most people fear more than anything else: public speaking, and giving a party where nobody shows up. So thank you for coming to my party.

What we're going to talk about today is a fascinating project I got to do through Command Prompt for a customer, where we migrated them from Sybase ASE 15.7 to Postgres 9.3. It was a very interesting journey. I'm Jeff Ross. I live in Montana; that's my lady Laura in the picture there. I'm a DBA at Command Prompt, and that picture is also from our deck, so that's the view we get all the time. People ask, "You don't want to go to New York City?" Well, not to live there.

So this will be the story of one migration from Sybase to Postgres, but it won't be the same as your story if you ever get to do a migration like this, because I'm sure every one of them is different. I'll tell you what we did, why we did it, and how we ended up doing it, and maybe that will give you a jumpstart on doing one of these.

Here's what the customer's requirements were. First, move from Sybase ASE 15.7 to Postgres 9.3 in an eight-hour window. This was about a 50-gigabyte database under pretty heavy load, and they actually have three data centers: one in Australia, one in Europe, and one in Virginia. The Virginia one was the great big one; the other two they were able to do in substantially less time.

Second, they provided us with a Postgres schema. The reason was that all their application code was built for Sybase, which is case-sensitive, so their table names were mixed case and the queries had to match them exactly. In the Postgres schema they put a regular lowercase schema underneath and created views on top of it using those same camel-cased names. Postgres folds unquoted identifiers down to lowercase anyway, so the apps' mixed-case queries resolved to those views automatically, and as far as the applications were concerned, nothing had to change; it all just worked.

Third, and this was really the most interesting part: after they went live, they wanted to track every change made in Postgres in case they had to fall back to Sybase. This was a big organization with a lot of corporate layers, and their people had done a big selling job just to get permission to try Postgres. So the corporate people said: if it doesn't work, we want to be able to roll back to Sybase, but we don't want to lose anything in the process. That meant tracking all inserts, updates, and deletes, and being able to put them in a form they could replay back into Sybase and take right off again.

So, first steps. I was told I was going to get to work on this project, converting Sybase to Postgres, and I went, "Sybase?" I'd heard of it; that's about it. The first thing I had to do was actually learn how to connect to Sybase. We had to do some VPN work to get set up with the company, and once I got in, I'm sitting on a Sybase box with no idea how to launch a client or connect to anything. So that was the first thing. Sybase does have documentation online, about a million iterations of basically the same thing, but it is pretty awful.
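Going back to the schema they handed us for a second, here is a minimal sketch of that case-folding trick, just to make it concrete. The schema, table, and column names are invented for illustration; they are not the customer's actual schema:

    psql -d targetdb <<'SQL'
    -- Plain lowercase schema underneath.
    CREATE SCHEMA base;
    CREATE TABLE base.customer_orders (
        order_id   integer PRIMARY KEY,
        order_date timestamp
    );

    -- The view is created with an unquoted CamelCase name, which Postgres
    -- folds down to lowercase ("customerorders") at creation time.
    CREATE VIEW CustomerOrders AS
        SELECT order_id AS OrderId, order_date AS OrderDate
        FROM base.customer_orders;

    -- The Sybase-era application's unquoted CamelCase query folds the same
    -- way, so it keeps working without touching the application code.
    SELECT OrderId, OrderDate FROM CustomerOrders;
    SQL

The whole point is just that unquoted identifiers fold to lowercase on both the CREATE and the SELECT, so they always meet in the middle.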
Coming from a Postgres background, where our documentation is superb, bar none, in every way you can imagine, looking at the Sybase documentation was night and day. Plus, their command line tools are really awful. The command line client for Sybase is called isql. It doesn't even have readline; it's like Oracle's command line client. As one person I read put it, it was never meant to be used for anything other than being able to say "we have a command line client," because you really can't work in it at all. sqsh is a replacement command line client for isql. It's much better, but it still has problems. The problem is that it's built on FreeTDS, and FreeTDS is a pretty buggy library. So we'd use sqsh to connect to Sybase, but if you tried to run a long-running query, it would just time out. I actually ended up using isql for a lot of the things we're going to look at, just because sqsh couldn't handle them. But for learning Sybase, sqsh was a much better replacement.

My first thought after I got that far was of a road sign I've seen in Montana. It says: choose your rut carefully, because you're going to be in it for the next five miles. That's not actually the rutted road in the picture, that's my nephew's mud bog truck from the day we were out there, but it's a good image for it.

So our first attempt used the TDS foreign data wrapper, which had been released very recently, maybe two weeks before we started on this project. You all know what foreign data wrappers are and how they work? Okay, everybody, all right, then I won't go into how they work and all that. There again, it needs FreeTDS and the FreeTDS development packages, same as sqsh, so it inherits the same bugs as the FreeTDS libraries. But it lets you do seamless data transfer from Sybase to Postgres: once you create the foreign server you're actually talking to Sybase, and it does on-the-fly data conversion for most data types, except for timestamps. So that looked like it was going to be pretty handy to have. Here's how you make a foreign server; we'll mostly skip over this, because it's basic stuff and it's all in the Postgres documentation: you create the extension for the TDS foreign data wrapper, create the server and the foreign tables, and go for it.
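Roughly, that setup looks like the following. The server address, credentials, and table are invented, and the tds_fdw option names are as I recall them from its documentation, so treat this as a sketch rather than our exact script:

    psql -d targetdb <<'SQL'
    -- tds_fdw speaks TDS (via FreeTDS) to the Sybase server.
    CREATE EXTENSION tds_fdw;

    CREATE SERVER sybase_svr
        FOREIGN DATA WRAPPER tds_fdw
        OPTIONS (servername 'sybase.example.com', port '5000', database 'proddb');

    CREATE USER MAPPING FOR postgres
        SERVER sybase_svr
        OPTIONS (username 'migration_user', password 'secret');

    -- One foreign table per Sybase table you want to read.
    CREATE FOREIGN TABLE sybase_customer_orders (
        order_id   integer,
        order_date timestamp
    )
    SERVER sybase_svr
    OPTIONS (table_name 'dbo.CustomerOrders');

    -- In principle the migration is then just:
    -- INSERT INTO base.customer_orders SELECT * FROM sybase_customer_orders;
    SQL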
So that would be how you would do it, if it was actually going to work. But it wouldn't. It works, but it was excruciatingly slow, and we had gazillions of rows that needed to be migrated, so I didn't see any way we could do a data import of that size through the foreign data wrapper. It was just going to be way too slow. Plus there's what we talked about with FreeTDS being buggy: it truncates your data at certain points, it loses connections. And there were some other things I don't even remember the details of anymore, but they all had to do with univarchar and unitext, which in Sybase are even more interesting. We'll get to that in a minute. So it was time to rethink, but fortunately it wasn't too late to change ruts; we had just barely gotten into the project, so there was a nice high spot where we could jump out of that rut.

I started looking around, and there were a few success stories available on the web, places where people talked about how they had done it. The best ones I found, the ones with the most detail, used the Sybase bcp program to copy data out of Sybase. But bcp has to write to a file, so they would write out files, secure-copy those files over to the Postgres server, and import them with psql. When you're talking 50 gigabytes, that's a lot of delay between the time you start and the time you can scp everything across, or even cat it across, into Postgres. It's an extra step we really wanted to avoid.

bcp is sort of like COPY, but it has some really interesting limitations. For one thing, it can't write to standard out; it has to write to a file. Another is that it doesn't know how to do CSV, and there's no other way to get things quoted, so we had to do the quoting for it. And there is no data type conversion whatsoever, because it's meant to transfer data from one Sybase to another Sybase. So there were some interesting hitches in the get-along ahead.

So then I got to thinking: all right, we have to write to a file, but we don't want to write to a real file. What else is like a file? A FIFO. A FIFO is a first-in, first-out named pipe. You actually make it as a file, with something like mkfifo /tmp/sybase_fifo, which is how I did it, and then whatever you write into that pipe gets handed to whatever is reading it on the other end. FIFOs worked great; they were the lifesaver in this case. There's a sketch of the whole pipeline a little further on.

Some FIFO-isms we learned along the way: you have to start reading from the FIFO before you start writing to it. If you start writing first, you fill up the buffer pretty quickly and then everything just stalls, and sometimes I couldn't get it to restart once that happened. It was easy enough to start reading from the FIFO before anything is being written to it; it just picks the data up as it arrives. We also found some null bytes in the data coming out of Sybase. Our folks at the company had no idea how they got there, but as soon as a null byte went into the pipe, it completely killed the FIFO. So we had to fix that too.

So here's how we had to fix up bcp to do the things we needed it to do. It couldn't do CSV, so we had to do all the escaping for it, and I created Sybase views to do that; I'll show you those in a minute. We used sqsh to query the Sybase equivalent of the information schema and pull the column names and data types out of it, and then I wrote a Python script to massage that data into the form we needed. As for the bcp options we used: -c is character mode, and even though the docs make it look optional, you really have to use it all the time with bcp or nothing works; -t sets the column delimiter, which we made a comma because we were going to do CSV; -J tells bcp the character set, UTF-8 in our case, otherwise it didn't want to read anything; and -T is the truncation length for text and image data, which defaults to 16K, and we just went with that. Yeah, truncate your data. That's a great database.
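Put together, a bcp export with those flags looks roughly like this. The database, view, server name, and credentials are invented, so this is the shape of the command rather than our exact invocation:

    # Copy a CSV-escaping Sybase view out in character mode.
    #   -c        character (text) mode; effectively mandatory
    #   -t ','    comma as the column delimiter, since we want CSV
    #   -J utf8   client character set
    #   -T 16384  maximum text/image length before bcp truncates
    bcp proddb..csv_customer_orders out /tmp/sybase_fifo \
        -c -t ',' -J utf8 -T 16384 \
        -S SYBASE_SERVER -U migration_user -P 'secret'

In our case the "file" bcp writes to is the FIFO rather than a real file, which is where this is all headed.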
So here's how we did the data conversion, in the view. Sybase has a convert function built in that converts one data type to another, sort of like casting but a little different, and some of the things we were converting had to go through convert; we couldn't do a cast. And convert is really slow, so tables that needed a lot of conversions took quite a bit longer to import than they would have if we'd had a better way to cast that data.

Ints and bigints were okay; no change needed there. Char and varchar we had to handle ourselves, because we couldn't get anything quoted: start the value off with a quote, trim all the extra junk off the end of the varchar column, string-replace any embedded quote characters in the text (so if somebody literally wrote he said "quote this", that inner quote gets doubled), and then add the other quote on the end. Basically we're doing the same thing a regular CSV conversion would do, except by hand, well, in the view.

Univarchar and unitext also have to run through convert. You can see there where you convert the univarchar column to a varchar(16384) and still put the quotes around it. So anything over 16,384 Unicode bytes, which are double-byte, so really 8K of characters, was truncated. That's a problem.

Timestamps also had to be converted, because they look like a timestamp in Sybase but Postgres didn't know how to read them, so you convert them to a different, more uniform type with convert. You have to tell it how long the string you're converting is, 19 characters, and choose output format 23; that's all Sybase-specific stuff, and it gives you something Postgres can parse.

So here's how we set up the FIFO. I made a command that pipes all the stuff through sed to take out the null bytes, because we don't want those getting into the pipe at all. That all comes through standard out, gets piped into psql with our connection string, and then it's just a COPY of the Postgres table, with all the columns listed out, FROM STDIN WITH CSV, into the target database. The next line is where we start reading from the FIFO and piping that across ssh to the postgres user, run that command, and background it so it just automatically goes.

[Audience question about compressing the output to cut transfer time.] Yeah, you can't compress it directly; it's text output once it runs through the conversion. We might have been able to get compressed data out of it somehow. Where were you when I needed that input? But honestly, once we got everything working we were transferring at pretty good speed, so we didn't have a problem with speed there; it was the conversion process that was the slow part. [Another audience question, about handling the conversion differently at load time.] Right, I tried that, except then I would have had to figure out how to get it into Postgres without quoting everything. We settled pretty quickly on using CSV, since it's a standard, and piping it across and going from there. We tried a lot of different variations, and this one worked and was pretty easy to implement, once I wrapped my head around what we actually had to do.
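Here's a rough sketch of that whole FIFO pipeline in one place. The host names, database, table, and credentials are invented, and the quoting in our production script was fussier than this, so treat it as the shape of the thing rather than the script we ran:

    # On the Sybase host: make the named pipe.
    mkfifo /tmp/sybase_fifo

    # Start the reader FIRST, in the background: read the FIFO, strip any
    # null bytes with sed, and ship the stream over ssh into psql, which
    # COPYs it into the target table as CSV.
    cat /tmp/sybase_fifo \
        | sed 's/\x00//g' \
        | ssh postgres@pg-host \
            "psql -d targetdb -c \"COPY customer_orders (order_id, order_date) FROM STDIN WITH CSV\"" &

    # Now let bcp write into the FIFO; the CSV escaping and type conversion
    # were already done by the Sybase view it reads from.
    bcp proddb..csv_customer_orders out /tmp/sybase_fifo \
        -c -t ',' -J utf8 -T 16384 \
        -S SYBASE_SERVER -U migration_user -P 'secret'

    # Wait for the backgrounded reader to drain the pipe, then clean up.
    wait
    rm /tmp/sybase_fifo

The only real trick is the ordering: the reader has to be attached to the FIFO before bcp starts writing, which is why it goes first and gets backgrounded.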
So we put it all together. We used isql to connect to Sybase and drop any previous views that might be left over from an earlier attempt, because you can't CREATE OR REPLACE a view in Sybase, so we had to drop it first. Then we used isql to create a view specifically designed to order the columns, convert the data, and do the CSV quoting. Then set up the FIFO, and use bcp to copy the data out to the FIFO, which pipes it over ssh to psql on the Postgres server. That was our basic process, and once I got it working with one table, which is all I was worried about initially, I remember we had an aha moment. I even remember saying in the chat room: you know what, we might actually be able to pull this off. Because up until that point I hadn't figured out a good way to get the data across to Postgres yet.

We were going to do that for all the tables, except some of them were over 100 million rows in Sybase, so we decided to partition those on IDs right off the get-go. For some tables the schemas didn't match: Sybase would have one data type and we couldn't convert it to the data type they had set up in Postgres, so I had to write a special case for each of those instances in this massive Postgres script that I'll show you at the end. The other tables we could do worked pretty well. Like any project of this kind it was test, fix, test, reiterate; just keep iterating over and over again until it starts running. What really helped is that they had a nightly application QA server we could connect to, which only had that day's snapshot of data. So it was lots of data we could play with, but not the 100 million rows, and there were still enough outliers in it that anything we were going to see later on down the line would show up and trip the FIFO, or trip Postgres, or something like that, and then I'd have another thing to go in and figure out how to fix. I was really happy with that, and they continued testing it while I went on to work on the rollback script.

For the rollback, what we decided to do was write an AFTER trigger in Postgres that would catch all inserts, updates, and deletes. It's a standard trigger, but we decided to put everything inside a new schema, separate from the regular database, so that when we got done we could just drop the changes schema and it's all over with; all that stuff is gone. And because speed was of the essence, in the same way we were generating all those scripts to import data, I actually made a column map table that mapped each Sybase table and its column names.
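To give a sense of the shape of that change-tracking setup, here is a minimal sketch of a changes schema with a log table and a row-level trigger. The table names, the log layout, and the use of row_to_json are my own illustration of the idea, not the actual rollback code from the project:

    psql -d targetdb <<'SQL'
    -- Everything lives in its own schema so it can be dropped wholesale
    -- once the fallback window has passed.
    CREATE SCHEMA changes;

    CREATE TABLE changes.change_log (
        id         bigserial PRIMARY KEY,
        changed_at timestamptz NOT NULL DEFAULT now(),
        table_name text        NOT NULL,
        operation  text        NOT NULL,   -- 'INSERT', 'UPDATE', or 'DELETE'
        row_data   json        NOT NULL    -- NEW row (or OLD row for deletes)
    );

    CREATE FUNCTION changes.log_change() RETURNS trigger AS $$
    BEGIN
        IF TG_OP = 'DELETE' THEN
            INSERT INTO changes.change_log (table_name, operation, row_data)
            VALUES (TG_TABLE_NAME, TG_OP, row_to_json(OLD));
            RETURN OLD;
        ELSE
            INSERT INTO changes.change_log (table_name, operation, row_data)
            VALUES (TG_TABLE_NAME, TG_OP, row_to_json(NEW));
            RETURN NEW;
        END IF;
    END;
    $$ LANGUAGE plpgsql;

    -- One trigger per tracked table (customer_orders is just an example).
    CREATE TRIGGER track_changes
        AFTER INSERT OR UPDATE OR DELETE ON customer_orders
        FOR EACH ROW EXECUTE PROCEDURE changes.log_change();
    SQL

In this sketch the log is generic JSON; turning the logged rows back into Sybase-flavored statements for replay would be where something like that column map table comes in.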