Thank you very much for the kind introduction. I'm talking about geospatial analysis, and the second thing you see in the title is JupyterHub. Has anyone of you used JupyterHub before? Oh, cool, so not too many, but more and more, that's great. Actually, I could have titled this talk JupyterLab instead of JupyterHub, but I want to show you how cool JupyterHub is. It's basically JupyterLab where you can log in. So if you go to the website of an installed JupyterHub, you just get this page here and you can sign in, and after signing in you see this. So it's a multi-user JupyterLab, and that's pretty much all. Here is something I'd like to show you too; this you can do with a regular Jupyter notebook or JupyterLab installation as well. You see I have three kernels here: a Python kernel, a Markdown kernel, and an R kernel. But there is another feature: you can have kernels with different Python versions, and that's quite handy. You just create a virtual environment, you see that above, using conda in our case, with whatever environment name you like, and then you specify Python 3.5, 3.6, 3.7, whatever, don't use 2, and the IPython kernel package. Then you activate this environment, install all the packages you want to use, and after that you can create a new kernel with the line above. You can list all the kernels using jupyter kernelspec list, and you see all the kernels installed. So if you repeat this procedure for, let's say, five different Python versions, you will see five different Python versions in your JupyterLab environment. That's really quite handy, and, now we come back to the original title, geospatial: if you install geospatial modules, you usually have to install many C-based libraries. For that, it's really, really recommended to have multiple Python versions and environments.
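The environment-plus-kernel procedure described above can be sketched as follows; the environment name "geo37" and the package list are made-up examples, not from the talk:

```shell
# Create a conda environment with a specific Python version and ipykernel.
conda create --name geo37 python=3.7 ipykernel

# Activate it and install the geospatial packages you want to use.
conda activate geo37
pip install rasterio fiona shapely geopandas folium

# Register the environment as a Jupyter kernel.
python -m ipykernel install --user --name geo37 --display-name "Python 3.7 (geo)"

# List all installed kernels to verify.
jupyter kernelspec list
```

Repeating this for each Python version gives one selectable kernel per environment in JupyterLab.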
And of course, if you are on JupyterHub, you have your file system there, and you can access all your user files from JupyterLab or JupyterHub. So what are we doing? We have an HPE Apollo 6500 server, and on this server we installed JupyterHub. We bought this machine with 48 cores and 192 GB of RAM and attached it to our small storage system with 120 terabytes, which is actually quite fast storage: we get 1 GB/s reading and writing speed. That's also a very important point: if you have terabytes of geodata, you want a really fast and reliable system. We also have four NVIDIA Tesla V100 GPUs in it. Wow, that's high-tech here; I think the cable should be changed tomorrow. Okay, what I wanted to say: we have NVIDIA Tesla V100s, the SXM2 model, that's one of them here; it uses lots of power and has 900 GB/s memory bandwidth, so it's quite fast. We use those to train our deep learning models, more about that maybe later. So what is geodata? There are some standards describing what geodata is, the ISO standards from Technical Committee 211 and so on. But the most important point is that most data you have has a geospatial component. Most data you have has a location component, or you can derive a location component from it. Mostly people use GIS software to load and manage this data. However, that's something I personally do not want to do; I use Python for that. So everything I'm showing you now with geodata is done in a Jupyter notebook, and you can really uninstall all GIS software if you do that. Today I'm limiting myself to vector data and a little bit of raster data. There is also geospatial data like point clouds and 3D objects, but that's not what I'm going to talk about. Everything I'm showing today is open source. The two most important libraries are C++ based: GDAL/OGR and GEOS. They have bindings in Python, but those bindings are really not Pythonic.
So therefore, some people created new Python modules which are really Pythonic and use the same C++ libraries, and it's much, much nicer to work with them. I would not recommend using GDAL directly. I would use Rasterio for raster data processing, Fiona for reading vector data, and Shapely for vector geometry operations, as I will show you in an instant. And if you know pandas, a really nice Python module, there is also GeoPandas, which extends pandas for geospatial data. So I give you the links to the projects we are looking at today. The most important thing is that we use a Jupyter notebook, and the first module I'm showing you is Folium. Folium is basically a wrapper around the Leaflet.js JavaScript library to create maps, one of many JavaScript libraries for maps. With three lines of code, you have a map in your Jupyter notebook or JupyterLab: you import the folium module, you create a map, and you specify a location and a zoom level. The zoom level is how far you are away from the ground; there are typically about 20 zoom levels. You know that from other mapping services like Google Maps, Bing Maps, OpenStreetMap, Yahoo Maps, and all the map services that exist today. Another thing: if you look at vector data, there are specifications like the OGC Simple Feature Access specification, where geodata, in this case vector data, is defined. This is used in many databases like PostGIS on PostgreSQL, and so on. One of several representations is just text: I use text to specify a point, text to specify a polygon, and so on. The reason is that you can print it, and in 100 years you can still read it. In the geo world, that's a very important topic. There is also WKB, a binary format, but I'm not talking about that now. So here are some examples. If you specify a point in WKT, well-known text, it's just POINT, then in brackets 10 20 in this example. And if you have a polygon, it's POLYGON followed by the coordinates.
There are also things like multi-polygons, where you have multiple polygons; for example, a country with islands consists of multiple polygons. There are also polygons with holes. This is all specified in WKT, so it's a nice thing. And we can use that directly. With Shapely we can create something similar to the WKT, just using Python lists and tuples for the coordinates. You import Polygon and Point, and then you specify your polygon. If you look at it, you see the first and the last point are the same. That's an important aspect of this standard: the first and the last point are the same, so we have a closed polygon. We can actually load it from text too. We can create a string with the WKT definition and load it using shapely.wkt and its loads function, loads as in "load string". Then we have our polygon definition. Another format, which is quite popular in the JavaScript world, is GeoJSON. There you also create your polygons and specify the coordinates; that's another approach to defining vector data. Of course there are many other formats too, I'm not going into details now, but that's what you find if you go into the geo business. So let's add such a GeoJSON in Folium. You see it's a little bit more complicated, but basically you open the GeoJSON file, you load it, and you put it on the map. Again, same syntax: you use the GeoJson class from Folium and add it to your map. In this case I loaded a GeoJSON of Switzerland. Okay, you see the shape of Switzerland. And now I do the same again, but I plot it directly using Shapely, and you see it's not the same; there is a distortion. Switzerland is not usually distorted, and the reason is that we have different coordinate systems. So let me show you the coordinates on the sphere.
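A minimal sketch of the Shapely constructions described above; the triangle coordinates are made up for illustration:

```python
from shapely.geometry import Point, Polygon
from shapely import wkt

# Build a polygon from (x, y) tuples. Note that the first and last
# point are the same, so the ring is explicitly closed, just like WKT.
triangle = Polygon([(0, 0), (4, 0), (2, 3), (0, 0)])
print(triangle.wkt)     # prints the WKT representation of the polygon

# The same kind of geometry can be loaded from a WKT string;
# loads means "load string", the text counterpart of WKB's binary form.
point = wkt.loads("POINT (10 20)")
print(point.x, point.y)  # 10.0 20.0
```

The round trip between Shapely objects and WKT text works in both directions, which is handy for archiving geometries in a plain, human-readable form.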
You know that there are longitude and latitude: longitude along the equator, latitude toward the poles. And you can project this onto a map. The easiest way is to go straight from the sphere: you just create a Cartesian coordinate system and map latitude and longitude onto it. Then you get this one, and that's a completely distorted image of the world; it's not what you see in Google Maps. There are even worse ones with more distortions. So there are some definitions. The Earth is an ellipsoid, and the World Geodetic System 1984 defines a datum describing how the Earth is best fit into a rotational ellipsoid, or spheroid. Out of that, you can create different map projections. I took three here out of many thousands; actually, you could invent your own map projection if you wanted to. And here I printed three of them, and you see they're all a little bit different. The Mercator projection is what you know from Google Maps, et cetera, and you see Antarctica down here looks bigger than most other continents, which is completely wrong; it's an effect of the projection. So we can look at these so-called coordinate reference systems or spatial reference systems, and there are two special cases. One is a geocentric Cartesian system, just a Cartesian system with X, Y, Z. Or we use projected coordinates, which are usually not 3D but flat. Practically every country has its own representation: Switzerland has its Swiss grid, and other countries have their special coordinate systems too. I'm not going into details here, but you can look it up at epsg.io; you can look up the system of your country. EPSG is the European Petroleum Survey Group; they catalog all these coordinate systems. For example, EPSG:4326 is the World Geodetic System 1984. Okay, that was a little bit off topic. Let's look at a real example. We are located around here, so we can say we have a longitude of 7.5; Greenwich is zero here.
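The talk shows no code for switching between reference systems, but as an aside, converting from EPSG:4326 (WGS 84) into the Swiss grid can be sketched with the pyproj module, which is not covered in the talk; the Basel coordinates are approximate:

```python
from pyproj import Transformer

# Reproject WGS 84 (EPSG:4326) coordinates into the Swiss grid
# LV95 (EPSG:2056). always_xy=True forces (longitude, latitude)
# input order regardless of the axis order the CRS defines.
transformer = Transformer.from_crs("EPSG:4326", "EPSG:2056", always_xy=True)
east, north = transformer.transform(7.59, 47.56)  # roughly Basel
print(round(east), round(north))
```

The result is in meters on the Swiss national grid, a flat projected system, rather than degrees on the ellipsoid.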
And we are 7 degrees to the east. Then 47 is the latitude; here would be the equator, so we go 47 degrees up, and we are in Switzerland at the Congress Center Basel. So that's how it works; maybe you'll see the problem in an instant. With Shapely, we can do some nice operations. We can check if a point is inside a polygon, for example. That's a very complex operation, but with Shapely it's just one line of code. So you create a point with the coordinates of the Congress Center Basel. I can look at its WKT representation; I see POINT and the coordinates, so everything is perfect. And then I run the check: is this EuroPython point within Switzerland? And we get the result False. So what did I do wrong? Lower case, wrong projection, all wrong? No, it's very simple, you see. I show you how it is done correctly. So what was the difference? I flipped latitude and longitude. Now I have the longitude first, and then it works. The problem is: Folium says first latitude, then longitude; Shapely says first longitude, then latitude. And that's a common problem. Some people say lat-long, some say long-lat; this one is better, no, that one is better, and the confusion is perfect. So we always have to consider that and know which module uses which order. Personally, I prefer the longitude-first approach, because it's something like x-axis first and y-axis second. But in geographic coordinates, you can't really say x-axis and y-axis, so that's a point many people find worth disputing. As I said before, there are other vector formats. I'm not going into the details; I just recommend, if you want to read vector data, use the Fiona module. But as time is running out, I'm quickly showing GeoPandas, which is pandas with the ability to make some geographic or geospatial queries. So I can load something.
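The latitude/longitude flip can be demonstrated with a small sketch; the rectangle below is a made-up stand-in for the real Switzerland polygon, and the Basel coordinates are approximate:

```python
from shapely.geometry import Point, Polygon

# A crude (longitude, latitude) bounding box around Switzerland,
# standing in for the real country polygon used in the talk.
switzerland = Polygon([(5.9, 45.8), (10.5, 45.8), (10.5, 47.8),
                       (5.9, 47.8), (5.9, 45.8)])

# Latitude first: the point lands at longitude 47.56, far outside.
wrong = Point(47.56, 7.59)
print(wrong.within(switzerland))   # False

# Longitude first, as Shapely expects: now the check succeeds.
right = Point(7.59, 47.56)
print(right.within(switzerland))   # True
```

Exactly the same one-line `within` check, with only the coordinate order changed, flips the answer, which is why it pays to know each module's convention.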
Let me load a dataset with all cities of the world with a population greater than 5,000. You can download this dataset at geonames.org. It's displayed very small here, so you can't see it well, because it has many columns. So I reduced it to the most important ones: I take the name, latitude, longitude, and population. You notice I take latitude first and then longitude. And that's the dataset. You can create a GeoPandas GeoDataFrame out of it. The trick is a column named geometry, and in this geometry column you have a Shapely representation of the geographic information. This could be a point, like in this case, or a polygon, a multi-polygon, or whatever. You can create your geometry column right there. GeoPandas can also plot, as we know it from pandas: you just take your GeoDataFrame and plot it. And if you plot all cities of the world, you more or less recognize the shape of the continents. Europe is quite green in this case, so there are many cities. So I can do some queries. It's basically pandas, so it's the same, and you see if I query for the name Basel, I get the Basel row. But the more interesting ones are spatial queries. So let me get the distance from the Congress Center here to all other cities in this dataset. I just create our point again, calculate the distance, put it in a new column, name it distance, and sort by this distance column. I show you the result; it's simple to understand. You have the name here, the geometry, and here the distance. You see we have Birsfelden just next to Basel, and Basel itself. It's a little bit strange, because the distance is always to the city center, the distance to the city's coordinates, so we are closer to Birsfelden than to Basel. Then Binningen, Weil am Rhein in Germany, Saint-Louis in France, and so on. Those are the nearby places with their distances. So I can also query within a polygon.
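A sketch of the GeoDataFrame construction and distance query described above, using a tiny hand-made table instead of the GeoNames download; all names and coordinates are approximate stand-ins:

```python
import pandas as pd
import geopandas as gpd
from shapely.geometry import Point

# A tiny stand-in for the GeoNames cities table (lat first, as downloaded).
df = pd.DataFrame({
    "name":      ["Basel", "Birsfelden", "Zurich"],
    "latitude":  [47.558, 47.553, 47.367],
    "longitude": [7.588, 7.623, 8.550],
})

# The trick: a column named "geometry" holding Shapely objects
# (longitude first, as always with Shapely).
gdf = gpd.GeoDataFrame(
    df, geometry=[Point(lon, lat) for lon, lat in zip(df.longitude, df.latitude)]
)

# Distance from the Congress Center Basel to every city, sorted ascending.
# Without a projected CRS these distances are in plain degrees, which is
# fine for ranking nearby places but not for metric measurements.
congress = Point(7.600, 47.563)
gdf["distance"] = gdf.geometry.distance(congress)
print(gdf.sort_values("distance")[["name", "distance"]])
```

Ordinary pandas queries such as `gdf[gdf.name == "Basel"]` keep working on the same frame, which is the appeal of GeoPandas.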
So I can use my polygon again and ask for all the cities within the polygon of Switzerland. And I can combine that with something else; for example, I would like all cities with a population bigger than 20,000. Here is the result; it's not sorted, but that doesn't matter. So I get all the cities in this dataset within Switzerland and with a population greater than 20,000. Let's do one more thing: display the cities on a Folium map. That's quite easy; you can combine those modules. You can use apply, for example, and specify a function that creates a marker for every city, and then you have that in Folium. So let me do a last example before the session chair throws me out. There is, for example, a nice dataset with the live earthquakes, the earthquakes of the last two weeks. You can download that directly with this link; I do that with the requests module, and then I store it as a file, earthquakes.geojson. I just did it half an hour ago, and that was the result. I can use GeoPandas to open my GeoJSON directly and display the first five incidents. Again, I simplified the dataset; I reduced it to four columns: time, magnitude, place, and geometry. We see the first five; it's not sorted. And we see a trend in California at the moment; there is a hotspot there. We can also create a histogram out of the data; that's a nice view, using a histogram with 16 bins in this case. Luckily, most earthquakes are around magnitude three, but there are higher ones in there, unfortunately. And you see in the first column here there is a timestamp. To change this timestamp into a more readable representation, you can use the datetime and timezone modules of Python and create a new, more readable column. So we have the 10th of July, in the UTC time zone. Maybe we'll hear something about time zones in a lightning talk, I don't know.
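The combined spatial and attribute filter can be sketched as follows; the rectangle again stands in for the real Switzerland polygon, and the mini-table with its population figures is made up for illustration:

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon

# Crude bounding box standing in for the Switzerland polygon.
switzerland = Polygon([(5.9, 45.8), (10.5, 45.8), (10.5, 47.8),
                       (5.9, 47.8), (5.9, 45.8)])

# Tiny stand-in table; Freiburg (Germany) lies outside the box.
cities = gpd.GeoDataFrame({
    "name":       ["Basel", "Freiburg", "Riehen"],
    "population": [171000, 230000, 21000],
    "geometry":   [Point(7.59, 47.56), Point(7.85, 48.00), Point(7.65, 47.58)],
})

# A spatial predicate and an ordinary pandas filter combined:
# cities inside the polygon AND with more than 20,000 inhabitants.
result = cities[cities.within(switzerland) & (cities.population > 20000)]
print(result["name"].tolist())   # Freiburg drops out despite its population
```

The same `result` frame can then be fed to Folium, for instance with an apply that adds one marker per row to a map.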
Oh, tomorrow. Tomorrow, okay, a very nice talk about time zones, very important, by Miroslav. So we can plot this. And we can also plot multiple geodata sets. You see I read this GeoDataFrame, and I can combine multiple plots using the axes; you can draw several layers onto the same axes. So I can display the continents and the earthquakes on top. You could do more; you could change the size of the dots depending on the magnitude. And I think the cable says it's time for questions. So thank you very much for your attention. Are there any? There is a microphone on the table, I think. Hello? Can you say something about what you use this very expensive computer for? Yeah, that's a good question. I actually wanted to say more about that, but I ran out of time; after 35 slides I said, oh, we have to stop. We do projects, for example, to detect solar panels on roofs. We have a dataset of orthophotos of the whole of Switzerland; that's about two terabytes of data. There we try to detect different kinds of solar panels, and for that we create deep learning models and train them, and for training we use the four GPUs. Oh, there. It's confusing, Mr. Microphone. And of course many other applications; we do many deep learning projects at the moment. No, I didn't skip it; I actually didn't even put it in this presentation, and I don't have it ready yet. Are there any solutions for geospatial queries in databases, in Django applications, that you would recommend? Because we've seen Python now, but if I have to trim it down to SQL, it becomes a bit more complex, especially from a Django application. Of course you can. This is something I don't like to answer at a Python conference, but you asked me: for example, PostgreSQL with PostGIS. And PostGIS supports spatial queries, too.
You can do the same things I showed here, and you can actually do them with PostGIS much faster than with the JupyterLab solution I showed you. So what I showed you is actually slower than PostGIS, but you can do the same things. The disadvantage, of course, is that you don't have a nice Python environment; you can't program it nicely like this, you can just do queries. I'm aware of that. But it's a feature of a specific database, and if I want to do it from Django, and the Django query should also work with SQLite, then I think I can't just use the Postgres features. Are you aware of the project GeoDjango? There is GeoDjango, which takes care of these details, so you can directly access the features of PostGIS from GeoDjango. Are there possibilities to use these libraries for planets other than Earth, so Mars, or...? Yes, that's actually no problem. You can do any planet; the only problem is that you don't have high-resolution data of other planets. But it's basically the same, you just need the model. There are models for Mars, for example; there are models for most of the near planets. On Earth we have the WGS 84 representation, but Mars is basically also an ellipsoid, so you can do exactly the same calculations. You could even do distance calculations from one point to another with GeoPandas and a Mars dataset. So it's no problem. Okay, thanks. Yes. I think all EuroPython slides will be on the conference website; all speakers will upload them, and you can download them from the page with the schedule. Just click on the talk and you will get the link to the slides. That's a very good question. Don't use GeoPandas for very large datasets. It's the same as with pandas: you can't use pandas for very large datasets at the moment. The developers are working on that; they try to do some memory voodoo, sorry for that. But it will not work unless you use other modules. I didn't show that because it's already too much detail.
If you use Fiona, for example, you can take one row of the dataset at a time and have only that in memory, so you do the memory management yourself. For example, if you want to do some distance calculations, you would just do them on a per-row basis. Then you could take a multi-terabyte dataset and do your calculations with that. For larger datasets there are also PySpark and GeoPySpark; there is a trend of putting a "geo" in front of classic Python modules. And with GeoPySpark on PySpark, you can do much bigger calculations; there is almost no limit, it's a hardware issue. If you have enough money for the hardware, you can process almost unlimited amounts of data. Okay, thank you very much again.
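The row-by-row Fiona approach mentioned in this answer can be sketched like this; the file name and the tiny GeoJSON content are made up for illustration:

```python
import json
import os
import tempfile

import fiona
from shapely.geometry import Point, shape

# Write a tiny GeoJSON file to stream from (a stand-in for a huge dataset).
features = [
    {"type": "Feature", "properties": {"name": n},
     "geometry": {"type": "Point", "coordinates": c}}
    for n, c in [("Basel", [7.59, 47.56]), ("Zurich", [8.55, 47.37])]
]
path = os.path.join(tempfile.mkdtemp(), "cities.geojson")
with open(path, "w") as f:
    json.dump({"type": "FeatureCollection", "features": features}, f)

# Stream the file feature by feature: only one record is in memory
# at a time, so the same loop would work on a multi-terabyte input.
congress = Point(7.60, 47.56)
nearest, best = None, float("inf")
with fiona.open(path) as src:
    for feat in src:  # lazy iteration, row by row
        dist = shape(feat["geometry"]).distance(congress)
        if dist < best:
            nearest, best = feat["properties"]["name"], dist
print(nearest)  # Basel
```

Unlike GeoPandas, which loads the whole frame, this pattern trades convenience for constant memory use.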