 What's going on everybody? Welcome back to another video. Today we're going to be solving a very hard sequel technical interview question. Now if you've been following along with the entire series we had videos on easy, medium and hard sequel technical interview questions and we solved two questions in each of those videos but we are on to the very hard questions. These very hard questions are in fact very difficult so I'm only going to be doing one although there's multiple on the platform that you can try but I'm just going to be doing one because it's going to take a long time to solve it. Now practicing these questions is meant to help you learn as well as feel comfortable and get ready for these technical interviews that you're going to get as a data analyst. The easy and medium questions are more geared toward entry-level beginners or maybe even mid-level for the medium questions. The hard questions are geared more toward mid-level or senior level data analysts and then there's the very hard I don't think that you're going to get a technical interview this difficult. I know I haven't nor have I ever given one even though I've given tons of technical interviews before. I've never gotten a question this hard. These are more of a challenge. Really a challenge of can you figure this out because it's pretty difficult so I hope that you find this really interesting. I want you to try it out but with that being said let's jump on to my screen. All right so we're here on Analyst Builder. Let's go over to the questions page. Let's filter down to very hard. Now I will note these are not under the free tier. If you go to the free the very very hard ones are not under the free tier. So if you want to try these these are the very hard ones we have consecutive visits, Twitter addiction, employee hierarchy, biggest spenders, complex address, Uber cancellation rates. Now today we're going to be trying this Complex address but if you want to try out any of these other ones head over to AnalystBuilder.com. They are super super fun but let's try out this complex address question. It says you are given a database containing customer addresses. Write a query to break out the address column into separate columns for street, city, state, and postal code. Note some addresses may have additional unit or suite information for example suite 5a or unit b which should not be included as part of the street. So let's go down here. Let's look at the addresses. We have 123 Main Street Suite 5a so that's for example that suite 5a should not be included it says. Then we have New York that's the city. I'm guessing that's New York City. Then it's New York then 12345. Then the same thing Minneapolis is the city. Then we have the state and we have the zip code. I think that's all we need to you know the postal code. So let's start making some notes here. So we have to the output needs to be street and I should just copy this street all the way down to postal code. There we go. So that's what our output needs to be. There's nothing on ordering. I think the most difficult part is going to be just breaking it out. So breaking everything out. Now how are we going to do this? There's one main way that I would be doing this and this is using sub string. Now I've done this a thousand times in my real job. This is an extremely realistic thing. Happens all the time. Data comes in just like this or sometimes separated by commas or just spaces and you have to figure that out right. So this is interesting because this is separated it's just a space. These are separated by these dashes and then we can't include this 5a. So I'm going to use sub string and we'll see if this works. But with this we can choose our delimiter and that's really important. So we can say whether it's a space whether it's a dash or something like that. But we also have to note I'll just write note right here can't include things like sweet or unit. So we have to remove that somehow. Let's go ahead and pull up the data over here and let's take a look and I wanted to do this in MySQL. Now you can do this in Python. You can do this in Microsoft SQL. I had started out writing this in PostgreSQL which the syntax is actually a little bit potentially maybe a little bit different. Actually it's different in Microsoft SQL server. I believe PostgreSQL the substring is the same. Don't quote me on that but we're going to be using MySQL because I've been using that throughout the entire series. But if you want to you can use Python MySQL PostgreSQL Microsoft SQL server whichever one you feel comfortable using. Now let's run this. So I'm going to keep everything here but I'm going to add to it so we can confirm that the it's accurate like our output is correct. So that's why I'm going to keep the everything there. Now we're going to use this substring and the first one that we have to figure out is this one right here. Now it should be fairly easy with something like one that does not include sweet 5a. And for example we would just do substring and we have to pass through some parameters. The first one that we need to pass through is just the string. Now this entire string is kept in address. So when I say we're passing through the string we're passing through a column where it has multiple rows with strings in it. That's all that's all I'm saying. The next parameter is our delimiter. A delimiter is something that how are we separating this out from itself. So I'm going to put in quotes I'm going to put a dash. And then the next parameter is where are we starting. So are we looking at the first delimiter the second delimiter the third fourth fifth because this one has multiple delimiter this one has one right here and it has one over here. So if I put a one here we get null and that's because I wrote substring we actually need substring underscore index. I'm thinking of python in python I'd be using substring in my sequel I need substring index. There we go. So now we're doing this on the first delimiter right here but if I change this to two now we're looking at this delimiter. So that's the first delimiter second delimiter. So we only want the first delimiter but here's the issue with this we have and there's a unit right here so we have one that has unit and we have one that's sweet five a here's what we need to do we have to get rid of it if it has a sweet five a or if it has unit B. So we'll need a case statement for this what we'll write is we'll write case and this will be our else. So if it has a sweet in it or it has a unit in it we will use the substring index on it but if it doesn't we're just going to treat it as normal this is like our normal one. So I'm gonna say and we'll do it right here I'll have to form out this in a little bit but we'll say when when the address is like and now we're looking for a pattern we're gonna search does this pattern exist in this. Now it's possible in some instances if you're using like a million rows of data sweet could be in like one two three sweet street I've never seen that before I don't think we need to take that into account today but that might be something to consider in a real-world example but this is a real-world example but we're just going to include sweet and I'm going to include and I'm going to use these wild cards so this means anything can come before this anything can come after this it just has to have a space sweet now you don't have to add the that you could do it just like that it makes more sense to me because we actually need to remove that in the future and I'll explain that a little bit but when that happens when there is a sweet in there what do we want to do well we need to use substring index on it but we cannot use the dash the dash has got this in trouble last time it kept the sweet 5a in there what we need to do is we need to use this sweet as our delimiter so then when it gets that delimiter it's not included right so then we come over here we put the sweet in there let's run it and I got a syntax error and that's because we need an end and we need and we can label it as well but this will be our street we have to have an end to signify that the case is done I'm pretty sure that's the issue there we go and that fixed it that's because like the delimiter with the dash when it got to the dash it took everything before it so now if we find a sweet we're using that space sweet as the delimiter now if we put this as the delimiter you may not be able to see it but there should be a space right here and that might cause dirty data I'm gonna actually I'm gonna keep it like that we'll see if my hypothesis is correct now we're gonna do the exact same thing except now we're looking for which one was it unit so if it includes sweet 5a or unit b those are the examples we need to get rid of that and we'll use unit and let's run it and there we go so this looks perfect this looks exactly like what we should be getting in our output so now uh whoops now we need to come here down to the street so now we have the street but we need to get next we need the city so the city is right after the street but here's the thing it's in the middle and if you've ever worked with data like this it's a little bit tricky because uh a delimiter only goes to one point right so what we can do is use a double uh substring index so we're collecting the substring index and then uh within that text of that substring index we do another substring index so I think that's what we need to do here so let's come down here and we can just copy this because it's already written out for us um and let's run this now what if we say 2 here let's run this um we can do it this way and then go backward we can do a minus so in substring index instead of a positive one looking forward that's starting from um the left hand side of the string and looking this way for the first one we can do negative which starts from the right hand side and looks this way so left to right when it's positive right to left when it's negative so then we can wrap this as a substring index and now this whole thing right here is our string which is this this is our string so now we're going to look backward we're going to do minus one um actually our our delimiter first is a dash because we're looking to this dash and then we need to go to negative one and let's try this out and there we go New York Minneapolis Goldsboro Maples Flower Town so this looks perfect so let's keep that exactly how we have it now for the next one that we need and I'm going to label this as city now we're going to copy this whole thing bring it right down here we'll do this as state now now obviously this isn't going to be our answer but we can run it so here's what we need to do before we took this whole string and we got here and then we went backward to this delimiter and selected New York what we now need to do is we need to go we need to select this New York Minneapolis now there's a space here and so what we should do is we should go backward to this index right there I think we need to do that first so we'll use that as our our starting place and so we'll do negative one um yeah so I'll just do it so we'll go negative one but then we need to go forward so this is going to be our string right here New York one two three or five so then we need to go forward and our delimiter should be a space let's do a space right here let's run this and there we go we have New York Minneapolis North Carolina Massachusetts I think in Florida so that's our state and we have one more and that's going to be our very last one now this one should actually be a little bit simpler because we're just starting at the very end it's not in the middle which is a little bit tricky right so we're going to come in here we're going to say as and this is postal code I really want to stick to exactly what they told us to call them I don't want to go changing it so now we're looking for a space delimiter but we're looking backward negative one so we're starting from the right hand side and going to the first space and that should give us that one two three four five let's run this and there we go now I'm going to get rid of everything and let me actually come up here I'm going to get rid of everything and let's run this so now we have the street city state and postal code this all looks correct I don't know I genuinely don't know if I check this answer if it's going to be correct or not my hypothesis what I think is going to happen is is is not going to be correct because I think somewhere in our output we have some spaces that we can't see for example this suite if we're using just the suite and not the space suite I think we have an extra space at the end of this one right here so it goes sweet space which I don't think is correct and it shouldn't you shouldn't have that an actual output in a real database you don't want leading or trailing spaces it's dirty data so let's try checking it and there we go our answer is wrong let's try just fixing this and seeing if that's it if not it could also be one of these but let's try this one now there we go so that is that was the exact issue and it's really hard to catch if you're not looking for it you may not catch it because it is a little bit tricky but that is let me see if I added this in as a hint maybe I did maybe I didn't but that is something that you need to account for in real data so I know that there were people in the discord if you haven't joined analyst builder and try out these questions we have a we have a discord with like 2000 people in it and this question when people have been trying to solve it have been really having issues with this exact thing it looks correct but it's not correct that is a real world solution a real world issue and so that is how you solve this question now you know I have a whole video explanation on how to do this as well as you can just come down here let's look at the solution so yeah this is exactly how I wrote it there are other ways to write this actually but that is how you can solve it now remember you can come in here even in the in the questions and go to the difficulty and you can take a look you can check out all these other questions and see what they look like and what the questions are these are really tough for example this uber cancellation rates is a really interesting run find the cancellation rates of requests with unbanned users both client and driver must not be banned each day between 2023 1223 and 1225 around the cancellation rates to two decimal points so this is another really interesting question and there's actually two tables here I'm not doing this one today but these are more questions I'll do more of these videos in the future just because I really love them they're super fun but if you want to try that question out I will leave a link in the description you can go ahead and check that out if you have not tried out analyst builder already I highly recommend it I created all the content myself all the questions all the courses are all done by me so if you like my youtube channel you will love analyst builders all premium really high quality content so go ahead and try that out thank you guys so much for watching I really appreciate it if you like this video be sure to like and subscribe below and I'll see you in the next video