Awesome, let's get started. Hi everyone, I am Rasagya. I am here to talk about maps, data, and how to get data on maps or in maps. Just to wake everybody up a bit and see whether everybody is actually listening to me, a quick raise of hands: how many of you used a map to come to MLR Convention? Great, so I think people are there in the last seats, and here I mean people from Israel. Cool. Second question: how many of you used a map to come to this auditorium? Not so many. A few of us were probably using the HasGeek app to figure out how the venue is laid out, but a few of us had already built a map in our head. So we had a mental map of how this place is; some of us are HasGeek regulars, so they already know this place very well. The third question, which is obviously the start of the talk: how many of you have taken data and put it on a map to show in a visualization, a project or an assignment? Awesome, okay, great. So that means a lot of you have already seen some of this stuff, which is great; a lot of you have been trying to do this stuff, which is also great; and some of you would probably be scared by some jargon, which hopefully by the end of the talk you will not be scared of.
So to quickly start off, I work at Mapbox. It's a platform for developers and designers to create different kinds of stuff with maps. You might want to make visualizations, you might want to make a dashboard, you might want to make a tool which shows how delivery is happening or, for the people here who use Snapchat, see where different people's photos are placed. You would use Mapbox or a tool like Mapbox. The way we like to describe it, it's like Lego building pieces: take different blocks, put them together and make what you want to make. So this is roughly what Mapbox is. At Mapbox, I work as a designer, so a lot of my work is in making custom maps.
So we sit and refine detail after detail of what shows up on a map, and sometimes it leads to really good results. If you look at, for example, the Fifth Elephant website, the map looks beautiful because it feels part of the whole website, which has this nice color scheme. So custom maps are fun, but this talk isn't so much about custom maps; it's much more about visualization. The other kind of stuff that I tend to do is to see how you could take data, put it on a map, and see if that gives a more insightful view than, say, conventional charts like bar charts or pie charts. I also play around with using maps as art. That's a separate topic. I spoke about it a while back at Fifth Elephant, but feel free to catch me if that's the kind of stuff you would like to talk about as well.
So let's get started. With all of the stuff that I keep doing, I tend to get two questions most often. The first question is "what?" Like, hey, I have this data, what can I do? Which essentially means: what are the different types of charts or visualizations that I can make with the kind of data that I have? Now, the common charts that we already know work very well. There's so much written about them. There's also a very seminal paper called A Tour through the Visualization Zoo. I'm hoping some of you have read it or heard about it. That's partially where the talk title, a voyage into visualizations, is taken from: Jeffrey Heer's seminal paper. But yeah, we all know about conventional charts. Some of us are not so familiar with what to do with a map, right? Or how to approach that. So, what can we do? And the second part, once you've already figured out how you want to show your data, is to think about: how do I do it well? If I'm making, say, a choropleth or a cartogram, what do I keep in mind so that I can do this well? Those are roughly the two things that I'm going to talk about. A very quick list of things that I want to pick.
I don't want to jargon-drop, but I want to use this as an itinerary for the talk today. I'm going to keep checking one thing after another off the list and see how we're doing. So, to quickly start off, I'll talk a bit about maps, about the things that you would want to put on a map, and then move forward.
Alright, so let's get started. This is a very typical map visualization. It shows the different bus stops that are there in Mumbai. You would probably recognize the shape. What I want you to think about is the layers here. Currently you can see there are two major layers. There is the data, these nice firefly-like dots that are shown on the map. And then there is the underlying data that's actually the map, right? There's a shape of Mumbai that you all can recognize. There is a label which tells you that it is Mumbai. There are these small airport symbols, and there are roads, etc. So for any visualization that I'm making, I need to think about two things. I need to think about the geo aspect, and by geo I mean: what is the map? What does the map look like? What will the data go on top of? And second is the actual data, that is, your own data. You have a CSV, a TopoJSON, a GeoJSON, or the different sorts of data sets that you already have, and the question is how to get that onto a map.
The reason I'm talking about this, and it's a very small segue, is to get everybody on the same page, and also to break down the very obvious terms that people generally use in geo visualization, which I think are very easy for all of us to learn. So let's quickly think about it. First, there are three types of things that you can show on a map. They can be nodes, ways or relations, which are just fancy terms for dots, lines, and squares or polygons, right? So you would have, say, places. For example, how many different MLR convention centers are there in Bangalore?
Each will be represented as a dot, and that works very well. You would probably have lines. For example, what is the path from one MLR to another MLR? How is the traffic on it? Which one is an alternate route? All those are lines. And then you would have polygons. For example, how big is the MLR convention? How much of that MLR convention is an auditorium? That would require you to use polygons. So roughly, in this case, you can see that the dot at the top, or the dot on Counter Culture, is a node. The distance from the MLR convention to the nice Phoenix mall is basically a route, so it's a line. And then you have the area of the mall; that's a polygon. That's roughly the technical stuff that you would want to know if you're doing anything on maps.
The second thing that you want to do, which we were just talking about, is to pick a map to show data on, right? By default we generally use something like this: bright colors, nice icons, lots of labels, a typical map, right? If you're using Mapbox, this is Mapbox Streets, and it is similar to the sort of style that you would get in other tools as well. But it's always interesting to try and change this map itself. If I'm going to show data on top of this map, I don't want to see the highway shield for 75 stand out so much. Maybe I don't even want things to be in different colors, so that I can use those colors for my data. We could do this simply by choosing a different base map, right? This is a very small segue, just to get the point across that at times you would want to compare and see what the underlying map looks like, and whether it helps show your data well.
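To make the dots, lines and polygons described above a little more concrete, here is a minimal sketch in Python of how each one might be written as a GeoJSON feature. The coordinates, the feature names and the `kind` property are all made up for illustration; they are not the real positions of the venues mentioned in the talk.

```python
import json

# Hypothetical [longitude, latitude] pairs, for illustration only.
VENUE = [77.700, 12.990]
MALL_CORNERS = [
    [77.695, 12.995],
    [77.697, 12.995],
    [77.697, 12.997],
    [77.695, 12.997],
    [77.695, 12.995],  # a polygon ring repeats its first position to close
]


def feature(geometry_type, coordinates, **properties):
    """Wrap a geometry in a GeoJSON Feature with free-form properties."""
    return {
        "type": "Feature",
        "properties": properties,
        "geometry": {"type": geometry_type, "coordinates": coordinates},
    }


features = [
    # A dot: a single position.
    feature("Point", VENUE, name="venue", kind="node"),
    # A line: an ordered list of positions.
    feature("LineString", [VENUE, [77.698, 12.993], MALL_CORNERS[0]],
            name="route", kind="way"),
    # A polygon: a list of rings, each ring a closed list of positions.
    feature("Polygon", [MALL_CORNERS], name="mall", kind="area"),
]

collection = {"type": "FeatureCollection", "features": features}
geojson_text = json.dumps(collection, indent=2)
```

The resulting `geojson_text` is the kind of file that most of the map tools mentioned in this talk can load as a data layer on top of a base map.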
Just to throw some points on it and see the difference: on the first map style, the Streets style, you can see that the data is visible but it kind of conflicts; there are so many other things competing for your attention. In the other map styles, we suddenly start focusing your attention on the data that we want, and in any geo visualization your first and primary goal is to make sure that the data is easily visible. So that was the introduction. Just a quick recap: you need to think about your base map and about the data before you jump in.
A quick segue here, also, on what exactly is a map, what is a visualization, and where they fuse. While we don't conventionally call it a visualization, seeing traffic on a map is itself a visualization. That's something we already use very regularly, but it doesn't really feel the same as conventional visualizations because it just feels part of the map. Suddenly the roads that you already know have a color attached to them, and that color shows whether there's high traffic or low traffic. That's an interesting way to visualize data, where it doesn't stand out in your face as something that's been put on top. Another very cool example that I love, which might not be very visible here, is in Google Maps, where you can see some areas highlighted because they have a lot of popular places in them. Let me try and see if I can get this here. If you notice, this is the 100 Feet Road, and you can start noticing the very subtle yellowish tinge that's been added to the map, which makes you realize that the part around the road here has a lot of interesting places whereas the rest of it doesn't. These are very subtle visualization techniques that people add onto a map, which don't suddenly feel like, oh, there's a big circle on the map, there's a big shape for that.
And these are super interesting to see. So yeah, this technique is super subtle. You're basically tweaking the layers of a map. It is also a very niche technique, and most of us sitting here don't really want to do this much. We want to make something very, very basic, so let's start with the most basic thing: I have a bunch of dots which represent something, and I want to put them on a map. So let's get started. I'll select a base map; in this case I've selected this very light map so that it doesn't stand out too much. And I've added a bunch of things here. These are all the bus stops that are marked on OpenStreetMap for Bangalore. You can see that it looks okay. There are too many dots. There are places where they overlap, and while we're getting all the attention onto the data, I think it's getting too much attention. So let's quickly change things and play around with some visual properties, like opacity. This gets a bit fuzzy, but it also starts highlighting some interesting parts of the data. You can start seeing some clusters around a particular set of roads, which makes this more interesting. Let's take this further. We could use color to show some other property of the data. In this case I might want to ask: out of the bus stops that are marked on OpenStreetMap, which ones have some sort of wheelchair attribute? Are they wheelchair friendly, are they not, or is there no data? Suddenly I can use color to make some of these dots pop. And next I can start using size as well. So now I can add a few more properties and make dots look bigger or smaller based on which one is important, or which data I want to stand out the most.
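As a rough sketch of the three steps just described (lower the opacity, color by a wheelchair attribute, size by importance), something like the following Python function could turn one bus stop's properties into a dot style. The property names `wheelchair` and `routes`, the colors and the radius range are assumptions for illustration, not the settings used in the slides.

```python
import math

# Assumed color choices; anything with clearly different hues would do.
WHEELCHAIR_COLORS = {"yes": "#1a9850", "no": "#d73027"}
UNKNOWN_COLOR = "#999999"  # used when the attribute is missing


def dot_style(props, max_routes=50):
    """Map one bus stop's properties to color, radius and opacity.

    `props` is a dict such as {"wheelchair": "yes", "routes": 12}.
    Both keys are hypothetical field names.
    """
    # Color encodes a category: wheelchair friendly, not friendly, no data.
    color = WHEELCHAIR_COLORS.get(props.get("wheelchair"), UNKNOWN_COLOR)

    # Size encodes a number. The square root keeps the dot's area,
    # not its radius, proportional to the value.
    routes = max(0, min(props.get("routes", 0), max_routes))
    radius = 2 + 8 * math.sqrt(routes / max_routes)

    # A low, fixed opacity lets overlapping dots build up into clusters.
    return {"color": color, "radius": round(radius, 2), "opacity": 0.4}


busy_stop = dot_style({"wheelchair": "yes", "routes": 50})
untagged_stop = dot_style({})
```

Each visual variable carries exactly one property of the data, which is the main thing to keep when adapting this to a real styling tool.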
In essence, what we just did, for folks who've been doing visualization for a while or have been reading some really good texts, is pick different visual encoding variables and apply them in the map style. For those who aren't used to this, there's a lot of literature about the different ways in which I could take a particular form on a map and let data denote different things on it. You could use color, you could change size, you could change direction, etc. So feel free to look this up; a lot of different people have given different explanations of it. But before we move forward, I want to make one small mention here about how data is separated. For those who have been using a lot of data sets, there's a very clear difference between data that is continuous, so it might be numbers, and data that is categorical, which could be discrete values. For example, if your data is how many buses stop at each bus stop, that's a number that changes from 0 to maybe 10 or 100, and you might want to use a gradual scale. On the other hand, each point might carry a quality or a type; let's say we want to represent the different types of facilities available at different bus stops. There isn't an inherent numerical or quantitative order in them, and that's the place where you would want to use a categorical scale (the first section here), where you identify them with colors which have different hues, and stuff like that.
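The difference between the two kinds of scale can be sketched in a few lines of Python. This is only an illustration: the two end colors of the gradual scale and the facility names in the categorical palette are assumed, not taken from the slides.

```python
def sequential_color(value, lo, hi,
                     start=(255, 255, 204), end=(128, 0, 38)):
    """Continuous data: blend gradually from a light to a dark RGB color."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    # Position of the value between lo and hi, clamped to the range 0..1.
    t = max(0.0, min(1.0, (value - lo) / (hi - lo)))
    return tuple(round(s + (e - s) * t) for s, e in zip(start, end))


# Categorical data: unordered values, so each one gets its own hue.
FACILITY_COLORS = {        # hypothetical facility types
    "shelter": "#1f77b4",
    "bench": "#ff7f0e",
    "ticket counter": "#2ca02c",
}


def categorical_color(category, palette=FACILITY_COLORS, fallback="#999999"):
    """Look up a distinct color per category; grey when it is unknown."""
    return palette.get(category, fallback)


few_buses = sequential_color(0, 0, 100)     # light end of the scale
many_buses = sequential_color(100, 0, 100)  # dark end of the scale
bench = categorical_color("bench")
```

The gradual scale preserves order (more buses, darker dot), while the categorical lookup deliberately implies no order between facility types.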
I just want to get some of this very basic stuff out of the way so that we all feel on the same page to start talking about some real map visualizations. So here are some examples. One is a fun map: a bunch of folks went around asking people in America, here's a map, can you tell me where Ukraine is? I probably shouldn't have said this beforehand, but I wonder how many of us could also mark where Ukraine is. In this case every dot is a guess. They've used color to show how close your guess is to the actual location of Ukraine, and they've used opacity to subtly highlight where people think Ukraine is versus where it actually is. So it's a fun way to get started with something as simple as putting dots on a map.
Here's another really interesting case. Here they're showing earthquake data over the years. They're using very subtle techniques of time as well as opacity, and there is also a fun symbol thrown in: a triangle which represents volcanic eruptions, and then there are two different colors which have been used. Suddenly you start seeing some really interesting use of something as basic as putting dots on a map. And this gets really interesting: you can make this interactive and start doing some really cool stuff with it. You can also create some really controversial pieces. For example, this is racial segregation in New York and neighbouring areas, where each dot is colored based on the most common race that lives there. These become really interesting for showing the detail in the data that you have, on a map that we are all so familiar with. But there are times when you want to highlight insights, and it's at times like these that you would want to use, say, size a lot. In this case this is a visualization over time where you're showing how many jobs were lost or gained; the color denotes that, and the total number is
showing the impact. You can see, in the initial part, the dot-com bubble that burst, and then the recession that came in. This is a graduated or proportional symbol map, if you want the technical names.
But these techniques very quickly hit a sort of wall. This is a very fun thing: it's essentially all the workflows that are there in New Jersey, and while the intention is really great, it doesn't communicate much. What's happening is that when we start plotting individual points on a map, there's only so many points that we can put, and after a while we hit a limit on what data can be shown. This is the cue for all the folks who are into big data, or huge amounts of data, to say: yes, this doesn't work, maps are useless, let's throw them out and use big data for something else. But let's not get ahead of ourselves. The first obvious solution to this problem is to cluster, right? We still show individual dots on a map, but if there are too many of them nearby, we cluster them and make a bigger dot, right? We can change the clustering based on the size and the type, and also add some simple effects; over here, for example, some color has been added to highlight more. And this is basically the Women's March that happened a few months back. Not trying to be political here, but this is a great example of what you could do with this. So essentially, with dots, it's really interesting if you keep in mind the different visual encoding principles that you have, and it's super important for us to make sure we control the density. If there are too many of these, we want to figure out something else.
Apart from clustering, the second something else, which we are all very familiar with, is choropleths, often called "chloropleths", though there is no "chloro" in it. What we are going to do here is something very simple: instead of showing individual dots, let's count how
many dots appear in an area that we are all very familiar with. Take the different states of the US; you all know what they look like. Let's count how many dots are in each one and just show that number, right? Suddenly it becomes very easy to show how much of your data falls in shapes that you already know. This is one example where you use a light-to-dark color to denote the density, or how much is there. This is a pretty good chart by Quartz. This is another example, by Mike Bostock, where two things are being used: very subtly changing the hue a little bit, and mostly changing the brightness, right? And this is a third example. Between this example and that one, you can see that there is a change in how the story is being told. This one goes from light to dark, so the darkness really stands out, whereas in this case both the end points really stand out. So this is a good example of how you want to use color in a choropleth map: see whether you want a diverging scale, which shows the positive and the negative parts of your data and highlights them much more than the average, or whether you just want a single scale that shows the maximum of the data set, right? I'm just quickly running through this part, but that is essentially what choropleths are, right? They give you familiarity. The only problem is that the data is always based on the area, right? For example, let's take the first case: each state is being colored, but we still don't know where exactly, or how much of, that state actually has humans to begin with, right? And this leads to these really fun visualizations where people say, oh, the places which tend to have a lot of McDonald's also tend to have a lot of car accidents, so you should shut McDonald's so that there are fewer car accidents. But the truth is that places with high population will have more McDonald's, and
more McDonald's have nothing to do with more car accidents; both are the same thing, population and population. A similar question always comes up when you do choropleths, right? A quick way to change that is to split the whole map into small things that you would want to call bins. I am going to quickly move through some of these examples just to get the point across about how diverse the map visualization scene is, and which one you would want to pick to get started with. So here is an example of hexbins: essentially breaking the map into hexagonal shapes and then counting the number of data points, or highlighting the data that you want to highlight. This is no longer too dense with too many points, and it's no longer too abstract or very, very high level like choropleths. It's somewhere in the middle. Each hexbin is equally sized, so it gives you a better perspective of how the data is distributed.
The next one is really interesting. There is a really lovely journalism piece on Germany, and how Germany, even though the wall between East and West Germany came down, is still very segregated on different attributes. What's really interesting is that they are using these very nice circular bins: break Germany into small pieces, all of equal size, count the data that you want to show, and then show it using color. It becomes really interesting to see the segregation patterns that show up. So in a sense, using bins is a great idea because it's the best of choropleths as well as clustering or dot plots, and it becomes really useful for getting some insight out of the data.
But let's move forward and go into the really wacky map visualizations as well. These are the parts where things get really interesting, especially if you are working with data which has some real, big impact. This is a map of the world, and it's interesting because it feels familiar but it has some really weird skewing going on, and
that's roughly what a cartogram is. A cartogram takes a familiar geographical area and skews it, moves it, tweaks it around based on a property that more truly reflects what you want to see. In this case you see population being used as the data to skew the countries, so you no longer see Europe standing out as much as it usually does, but India and China suddenly become these huge mammoth blobs on the map. Here's another example, which I really like because it shows this interactively and really gets the point across. If you look at the top, I'm just switching between different data fields and the map slowly morphs to show the view of the world based on that data. This is a very impressive technique. It really breaks the biases that you might have because of how a map looks. Just because there is a huge state doesn't mean that it will have as many people, and the density of people, or your own data properties, becomes much more useful and easier to reflect. And here's another technique. So far, what we have been doing is taking a geographical area that we are used to seeing and tweaking it around.
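To give a feel for the idea of resizing areas by data, here is a small Python sketch of a much simpler relative of the cartograms shown so far: a non-contiguous cartogram, where each region is shrunk about its own centroid until its area is proportional to its value. This is a swapped-in, simpler technique and not the method behind the slides; it also assumes flat (already projected) x/y coordinates and made-up regions and values.

```python
import math


def signed_area(ring):
    """Shoelace formula; `ring` is a list of (x, y) without a repeated end."""
    n = len(ring)
    return 0.5 * sum(
        ring[i][0] * ring[(i + 1) % n][1] - ring[(i + 1) % n][0] * ring[i][1]
        for i in range(n)
    )


def polygon_area(ring):
    return abs(signed_area(ring))


def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon."""
    n, area = len(ring), signed_area(ring)
    cx = cy = 0.0
    for i in range(n):
        (x0, y0), (x1, y1) = ring[i], ring[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    return cx / (6 * area), cy / (6 * area)


def noncontiguous_cartogram(regions):
    """Scale each region so that its new area is proportional to its value.

    `regions` maps a name to (ring, value). The region with the highest
    value per unit area keeps its size; every other region shrinks.
    """
    density = {name: value / polygon_area(ring)
               for name, (ring, value) in regions.items()}
    max_density = max(density.values())
    result = {}
    for name, (ring, _value) in regions.items():
        # Area scales with k squared, hence the square root.
        k = math.sqrt(density[name] / max_density)
        cx, cy = polygon_centroid(ring)
        result[name] = [(cx + k * (x - cx), cy + k * (y - cy)) for x, y in ring]
    return result


# Two hypothetical square "states" with the same value but different sizes.
regions = {
    "small": ([(0, 0), (2, 0), (2, 2), (0, 2)], 4.0),
    "large": ([(10, 0), (14, 0), (14, 4), (10, 4)], 4.0),
}
resized = noncontiguous_cartogram(regions)
```

After resizing, both squares cover the same area because they carry the same value, which is the bias-removing effect described above, just without the continuous skewing.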
The second technique, on the other hand, makes a shape for each geographical area, representing it in a form that we are very comfortable with, and then moves it around, so it becomes slightly more abstract. This is called the Dorling cartogram. This is an example from Mike Bostock's Protovis time, but it's super interesting because it illustrates two things. It illustrates how obesity has increased over time in the USA, and the color gives, not the density, but the percentage of people in each state who are obese. So this is going a bit towards the abstract side: we are no longer using the map as we see it, but an abstract representation of the map, to visualize data. And there is another very interesting piece. If you are interested in cartograms in general, the best idea is to look at a lot of journalism websites, because they deal with these sorts of issues the most: you have the shape of a country or a state which you are very used to seeing, but you change it so that the data actually reflects the story that you are telling. So here is a square cartogram which changes the size based on how many policies there are, and it uses color to identify: is this a state that has said yes, is this a state that is thinking about it, or is this a state that has said no? And here is another interactive switch between a traditional choropleth and a cartogram, and I think this is a good example to highlight. This is the UK, and you can see London, which is way down below, gets highlighted more prominently, and some parts of the map which are generally less populous are reduced in size. So this is a fun technique to try out. I would love to have conversations later on how to make these, or on what exactly the problem is that you have been trying to solve with them. But there is another technique, and I am just here to talk about different techniques so that we all have
different options to think about the next time we are doing something. This technique is called tile grid maps. In the previous one we were changing the shapes or the sizes of different parts of the map based on the data; in this one we represent each and every shape as a single, equally sized tile. This becomes more interesting if your data set is something that is equally important for each state: it does not depend on population, it definitely does not depend on geographical area, it just depends on the number for each state. For example, if we start talking about what percentage of the budget goes to each state in countries like the USA or India, you do not care whether the state is huge or not. You are first interested in seeing the percentage; maybe you want to do it per capita, which removes population altogether and gives you a better picture. In cases like this a tile grid map becomes really interesting, and it becomes even more interesting if you use something called small multiples. For folks who are used to data visualization, small multiples is a very familiar term: make a chart, repeat it, and change the data to show seasonality or change over time. Tile grids, because they are so abstract and easy to show in a very small space, make for great contenders for small multiples. So this is a great example showing similar data on obesity, but now in a single view instead of using a timeline to move through it.
So that was cartograms. Cartograms are slightly complicated to create; you can use tools which are, say, GIS specific, or you can write a bit of code and create them. But they are much more honest; they are much closer to the data that you want to show.
There are a last few that I want to quickly talk about. I know some people are also getting hungry, and there have been three data viz talks which have been heating up all your brains, so I will quickly go through these and not take too
much time on this. So, heat maps. Very familiar: we have all seen, say, weather or different sorts of variables being shown as heat maps. I am going to show the Snapchat example. Some of you who have Snapchat have seen this, but if you have not, what this is essentially doing is finding clusters where people are posting photographs on Snapchat publicly and showing them as a heat map. So suddenly a heat map is no longer just a data visualization technique; it's also a great technique to nudge people to explore different parts of the data, and this becomes really interesting in some daily interactive use cases, one of them being Snapchat. So essentially, heat maps are great if you want a high-level overview, and the only caveat is to choose your colors wisely. In this case you can notice that the color palette goes from a light color to a dark color, even though it's still very rainbow-ish. That's something we can talk a bit more about in the OTR session if you want, or catch me afterwards on how to pick colors well, because colors, even though we think red, green and blue feel like a natural progression, have different hue, brightness and saturation, so you need to see how you would want to use them.
And a last two very easy examples to talk about. First, isopleths. It's a very scary name, and they have many more variations, like isochrones and isobars, etc., but they're essentially using lines to mark data as boundaries, right? Here is a very easy example: from MLR Convention in Bangalore, which parts of Bangalore can I get to in different amounts of time, right? We're very used to seeing point-to-point data, typically saying that from place A to place B it will take this much time. Now we put temporal, or time-related, data on top of a map and start seeing: huh, this is
interesting, there's some part of the infrastructure that's really limiting people to going in one direction. Maybe that's a great example for somebody in infrastructure to understand, or maybe it's just easier for us to understand: hey, next time we are hosting Fifth Elephant, what is the most accessible place, which the most people in Bangalore can reach? So isochrones are super interesting. They are slightly complicated to make because they use a lot of data; in this case you would want to use some sort of directions or navigation data set that's going to tell you, okay, this to this, oops, so this to this is going to take this much time. But it's super interesting to get the insights out. I know this is now going to be a bigger trigger for people to slowly move out, but before that let's quickly finish two more things.
Flow maps were one of the more commonly used map techniques back in the days before you had interactivity, before you had all of these fancy tools that we have, and this is one of the many interesting flow maps that you can see. These become much more interesting when you have connectivity, or connections, in your data. This data is slightly more complex to represent, and you need to mould your data to make such maps easily. They're also very interesting for telling stories. Some of my favorite stories that I've seen in journalism have come from people showing flow from one place to another to tell insights. In this case, for example, if you can see the arrows, this shows where students from each state in the USA go to another state just to attend public schools, and that makes for a very interesting conversation that none of the other map visualization techniques would be able to convey so successfully or easily. So to wrap this up: flow maps are essentially connections, and they're really useful for storytelling and building narratives around. And last but not the least, because this is something that people love and
hate, is 3D. I'm going to wrap this up by talking about two very interesting examples of what you would do when you have 3D stuff on a map. This is the first example. This is Vancouver, and the government there is trying to see how the area is divided into different zones based on the different types of usage of the buildings. It's interesting here to see the data not just in a 2D view but also in a 3D view, because you get a better perspective of, okay, this is actually how the city is. These areas have really small, low-height buildings, whereas that's the main downtown area, and the downtown area has been zoned to have these types of buildings, right? So this is an interesting use case where you would want to use 3D: you want to show the real world as closely as possible, and then you use a common technique, in this case essentially a sort of choropleth map, but with a 3D element to make it interesting. Another reason to use 3D is to actually make good use of the third dimension. Instead of just showing a two-dimensional map which shows one data property, you can start using more dimensions on the map: there will be, say, color, and in this case there is also height, and there is also volume, right? So you start using 3D as a nice avenue to show more insightful data queries that you might not be able to make if you were looking at a 2D map. In this case the bar's height shows the population density in each of the small buildings, but because it's a 3D representation you can also see how much volume there is, which essentially is the number of people staying there, right? So suddenly you can tell the difference: okay, this part is really thin but really tall, and that means that a lot of people are trying to stay in this really small area. That stands out a lot. If you notice some parts slightly south of San Francisco, you'll see these anomalies. These are some
of the good examples or use cases where you would want to try and use 3D. In all other cases 3D is generally a gimmick, and as many other data viz guys would say, before you use 3D, think about whether you really need to use 3D, right?
So, I know this has been a very heavy load, suddenly textbookish, like, oh my god, 10 different ways to visualize data on a map. But the goal is to say: hey, these are the other things that you could do. Before we wrap up, I just want to mention very quickly the tools that you would want to use. I am a designer, so I don't write as much code as I would like to, but I know a lot of people here love to do that, so this is something we can have a longer discussion on in the OTR session. Just to summarize very quickly: there are interactive tools, tools in which you do not have to write code. There's Tableau, which is really great: drag and drop, you get your data loaded, and it's up and ready for you to start visualizing. There is Mapbox Studio, which is what I have been using to make all these custom maps; it gives you much more control over each and every element of the map, and then you can overlay data on top of it. And there are other tools like CARTO, which also do a lot of interesting data visualization by drag and drop, or by very seamlessly making map styles. Then there are the slightly more code-oriented things, which I think we are more familiar with, the most familiar name being D3. Mapbox has a WebGL graphics library, Mapbox GL. There is also Uber, the guys are outside so you should probably chat with them, and there's deck.gl. And then there are the GIS tools.
So I want to wrap up with this very interesting and slightly tongue-in-cheek quote, which says that nothing is certain in this life but death, taxes and requests for geographic data to be put on a map. So with that, that's me, Rasagya. Thank you so much. We have time for questions, so we could take some questions now, or otherwise you can all come over for the OTR
session, where Anand, Amit and I will be hanging around as well. Any questions, or do we all just follow the crowd? Everybody wants a tea break. We'll be back in here at 11.50, and at noon we have an OTR on financial data analysis.