Hello, everyone. It's a pleasure to be here. My name is Philippe Heymans. I'm an advanced data scientist at Growth Acceleration Partners, and today I'm going to give you a rundown on how you can use the chronicle R package to easily create R Markdown reports. So chronicle aims to be an opinionated assistant to whom you can delegate the task of creating an R Markdown report, leaving to you only the task of specifying what you want to have in it. Here we can see a small example of this. We are creating a report with a table, a raincloud plot, some text, and a box plot, and you can see that the parameters the functions ask for are very straightforward. There are several other parameters for each of these functions, but they have very considerate default values to avoid the need for user specification. Once you are satisfied with your report, you can just call the render_report function, and we can go ahead and take a look at the output. So here you can see the table with its specified title, then the raincloud plot with its title automatically generated from your specifications, then the text, and finally the box plot. All plots by default are plotly plots, which I like because they let you follow along with discussions about the data in an interactive manner. So this is the entire list of elements currently supported by chronicle. You can see most of them are plots, but you can also add raw code and decide whether you want to evaluate it or just show it. You can add images, plain text that renders as R Markdown, and tables, both static and as DataTable HTML widgets. And, as I said, a lot of plots. So by now you're probably wondering what's happening behind the scenes: how does chronicle create your R Markdown reports? If you print the report that we defined here, you can see that it's literally writing an R Markdown report for you. So you have one chunk for every element you have added.
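A sketch of the pipeline just described might look like the following. The add_* and render_report function names are the ones mentioned in the talk; the report constructor and the exact argument names are assumptions for illustration, not the verified chronicle signature.

```r
# Build a report by chaining add_* calls, then render it.
# new_report() and the argument names (table_title, dt, value, groups)
# are assumptions based on the talk, not the confirmed API.
library(chronicle)
library(magrittr)

report <- new_report() %>%
  add_table(table = head(iris), table_title = "A glimpse at iris") %>%
  add_raincloud(dt = iris, value = "Sepal.Length", groups = "Species") %>%
  add_text(text = "Sepal length differs clearly between species.") %>%
  add_boxplot(dt = iris, value = "Sepal.Width", groups = "Species")

# One call writes the R Markdown file and knits it
render_report(report, filename = "iris_report", title = "Iris report")
```

Printing `report` at the console would show the generated R Markdown source, one chunk per element, as described above.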
Well, of course, the text is not an R chunk, but here you can see the four elements that we added, explicitly written out. And if you take a closer look at the chunks for the raincloud and the box plot, you can see that they are calling the make_raincloud and make_boxplot functions. Those are part of the make_* family of functions, the ones that actually do the heavy lifting of building the plot you requested according to your specifications. They can also be called independently, for example for a presentation, or anywhere a ggplot or an HTML widget would make sense. Please do not take this as an invitation to avoid learning ggplot2. ggplot2 is a wonderful tool with a beautiful paradigm and, in my opinion, a key part of your journey as an R practitioner. A small disclaimer there. Now that we have covered the content of the report, we can dive a little deeper into the rendering process. The render_report function calls the render function from the rmarkdown package. And if you're not familiar with calling functions to render R Markdown files, you are probably unaware that there is an envir parameter. When you click the Knit button while viewing an R Markdown file, it is rendered from scratch in an empty environment. For chronicle, however, it makes sense to have visibility of your entire global environment. This is, of course, a double-edged sword: if you are messy, you will get messy reports, with half-defined things and the like. But if you run the loading of your data, your processing, your modeling, and you get your results, you can use this to create several different reports for different audiences in a single call, while only having to process the data once. So here, for example, we want a report for our director. Our director loves using the commenting tool in Microsoft Word.
So let's give her a Word document, and we can even specify that it should use our institutional template to keep things consistent. Then we might have another niche report for a specialist team that is not concerned with most of what we do, but does have an interest in one particular analysis we make, so we can send that to them. And finally, we can have our own internal HTML report, where using plotly to explore the data further makes sense. So we could use these output formats, which are the ones I showed you previously and which are the defaults of the chronicle render calls. But those tend to produce rather large files, so perhaps we can also render a PDF for bookkeeping. And this is the entire list of rendering options the render_report function has. It covers the basics: file name, author, date, etc. You can choose the output format, as previously shown. And these latter two options matter if you want to make more sophisticated things that chronicle currently doesn't handle; it doesn't mean that you can't use chronicle at all. You can ask render_report to only build the .Rmd file, without rendering it, and write it to your directory, so you can open it up, manually finish the more complex things that chronicle doesn't handle, and then render. Then there's the table of contents stuff, which is very straightforward. The figure width and height, which are the defaults for every plot that is not explicitly specified to have different ones. Same with the plot palette. And then there's the plot palette generator, which is a safeguard to make sure that you don't run out of colors and get an error in your rendering. It leverages the viridis palettes, which discretize a continuous spectrum, so you always have enough colors for your plots.
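The multi-audience workflow being described could be sketched like this, assuming the data prep has already run once in the global environment. The report objects and the output-format argument names are hypothetical placeholders based on the talk, not the verified chronicle interface.

```r
# One processed global environment, several audiences.
# director_report, specialist_report and internal_report are
# hypothetical report objects built earlier with add_* calls;
# output_format and reference_docx are assumed argument names.
library(chronicle)

# Word document for the director, with the institutional template
render_report(director_report, filename = "results_for_director",
              output_format = "word_document",
              reference_docx = "institutional_template.docx")

# Focused HTML report for the specialist team
render_report(specialist_report, filename = "specialist_analysis",
              output_format = "html_document")

# Lightweight PDF copy for bookkeeping
render_report(internal_report, filename = "internal_archive",
              output_format = "pdf_document")
```

Because rendering sees the global environment, the data is processed once and reused by every call.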
And finally, there are the themes that you can specify depending on the output format you're using, which gives me the opportunity to mention all the output formats that chronicle can support. Here, those are split between interactive and static reports, because some of them do not support HTML widgets. But you do get feature parity with this: you can have exactly the same content in both a static and an interactive report. The only difference is that the static ones use, well, static ggplots instead of plotly-translated versions of the ggplots. And to finish off the feature tour, we can go through the report_columns function, which is sort of the cherry on top of this package. We can just feed it some data; for example, we'll use the Palmer Penguins data set. And here you can see the output. It gives you two sections. First, the data set overview: this is courtesy of the skim function from the skimr package. It gives you an overview of the structure of your data, and then a summary for both your categorical and numerical variables, including a nice little histogram. The second section is there if you want to go deeper into your data: it gives you an ordered horizontal bar plot for each categorical variable and a raincloud plot for each numerical variable. But what happens if there is some key variable that will guide your analysis? For example, say the species of the penguins is what concerns you most. You can call this exact same function with the by_column parameter specifying species, and that will give you this output, which says variable analysis by species. It gives you the same data overview, but then a second data overview split by species, again courtesy of the skim function. And here you can see that you have one summary for each value of your categorical column, and the same for the numerical summary. And then the variable plots.
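The two report_columns calls being described would look something like this. report_columns and by_column are the names given in the talk; the filename argument is an assumption for illustration.

```r
# One-call exploratory report over every column of a data set.
# by_column comes from the talk; filename is an assumed argument.
library(chronicle)
library(palmerpenguins)

# Overall overview: skimr-based summary plus per-variable plots
report_columns(penguins, filename = "penguins_overview")

# Same analysis, with every summary and plot split by species
report_columns(penguins, by_column = "species",
               filename = "penguins_by_species")
```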
You have your protagonist column first here, and then all other plots are broken down by the specified column. So here you can see the raincloud plots by species, and also the bar plots broken down by the species variable. And to finish up: sadly, this is not all roses. We have a few limitations, mainly coming from using plotly as the plotting package. Admittedly, they are probably fixable, but I haven't gotten to the perfect way to fix them. There are compatibility limitations with custom geometries. For example, those raincloud plots would usually be created with the ggdist package; however, plotly's ggplotly translator does not understand those geoms, so I had to build the raincloud plots manually, and that keeps the number of supported plots currently low. Then there's the issue of data size, because large data sets translate to very large report files. To alleviate that, chronicle has this line in the sand: if your data set is over 10,000 rows, the plots are made static, no matter what the output format is. It's an ugly line in the sand, but it is the current solution to avoid getting reports of several gigabytes, which are, of course, impossible to open in any browser. And then there are the unsupported formats. This is the sad part of the presentation, where I admit that this presentation was not created with chronicle. That is because the separators and the custom format that xaringan uses are not the same as in every other supported format, so there is no immediate way to render xaringan reports adequately. A similar thing happens with flexdashboard, which has its own custom layouts to contain the data, the plots, and the content. That again technically renders, but not in a pretty way. And then there's rticles.
If you're unfamiliar with rticles, this is a package of journal templates, ready for publication. And please, just don't use chronicle to write academic papers. It won't be an enjoyable process. Finally, for anything that is not currently supported, we do have a way to alleviate it, which is the add_code function. If there's something you want in your report that chronicle doesn't do directly, just feed it to add_code, and there you have it, magically supported. It gets inserted verbatim as code, no questions asked. So this is the chronicle package. It would be an honor if any of you took it out for a spin, and hopefully it can save you some time and effort. Thanks.

Thank you, Philippe. Our next talk, the third talk today, is on new displays for the visualization of multivariate data in the tourr package, by Ursula Laa.

Hey, good morning, everyone, from Paris. So I'll be doing my talk live, so I'm just sharing my screen. I think you should be seeing that. I'll just arrange some of this Zoom stuff as well. Thank you. All right. So hey, good morning, everyone, from Paris. My name is Ursula. I'm with BOKU University, which is based in Vienna. And what I'll be talking about today are some new displays for the visualization of big multivariate data in the tourr package. I realize that the title probably already has a few things that aren't quite clear, so as I go through my talk, I'll explain what exactly I mean by some of those things. I want to start by explaining what the tour is and what the tourr package is. Maybe many of you haven't heard about the tour before, but the grand tour essentially is a display that allows us to visualize data beyond maybe two or three dimensions. The way that works is that we're looking at smoothly interpolated sequences of linear projections of the data.
And because those smooth interpolations correspond to slowly rotating something in the high-dimensional space and looking at it in low-dimensional projections, we can start to understand some of the multivariate aspects of the data: things like getting intuition about the shape of the distribution, seeing clustering, or maybe spotting some multivariate outliers. But rather than going more into the technical aspects or the mathematics, I wanted to show you some examples which I think can explain how that works in practice and how it can be useful. So I have three examples here. I'm just going to hit play on the first one, on the left. What you can see here is a grand tour showing a wireframe cube, a four-dimensional hypercube. You probably realize that what you're seeing at each step of this animation is just a two-dimensional projection, but you can also see that every time I'm just rotating the view a little bit, so what I'm seeing now makes sense with what I've seen before. And as we watch the animation, we can start to understand what this four-dimensional object actually looks like. I'm going to stop this one here. So that works nicely with geometric shapes like this hypercube, but it also works more generally with distributions. The second example that I have here, in the middle, is a posterior sample in five dimensions. It's a short animation, but you start to see that those points actually fall on some curved surface within that five-dimensional space, so there is probably some low-dimensional representation of the data that we could find. And the last example I wanted to show you is on grouping. Here's a six-dimensional data set, and maybe this is a slightly unusual type of grouping, because all three groups, shown here in different colors, actually pass through the same mean.
But through the animation, we can start to see that they're actually extending in different directions in this six-dimensional space, something that is really hard to capture with just one linear projection. So I'll stop my intro to the tour here and just summarize: we've seen that we can understand some multivariate features in the data. And I haven't really emphasized this, but each view is actually just a linear projection, which has certain advantages. The big advantage, in my opinion, is that it's really straightforward to interpret in terms of the original parameters. A drawback, or a limitation, is that once we have large data, the typical scatter plot displays that I've shown you on the previous slide start to not work that well. There are two different types of large data that lead to this kind of problem. One way of having large data is having a lot of observations, that is, a lot of rows in our data frame. If you have worked with that type of data before, you are probably aware that you start to overplot a lot of points and might miss certain features. This is especially true if you have what I'm calling here concave features: you could think of some hollowness inside a distribution that tends to get hidden in a projection. The other way of having large data is having a large number of variables, so a lot of columns in your data frame. Then you again have an overplotting problem at some point, but also, even with smaller samples, with a large number of variables the points tend to pile up near the center. This is also called the crowding problem.
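The crowding problem can be made concrete with a few lines of R: sample points uniformly on the surface of a p-dimensional sphere and look at the radius of their projection onto the first two coordinates. This sketch is an illustration of the effect, not code from the tourr package.

```r
# Crowding illustration: as dimensionality p grows, the radii of a
# 2D projection of points on the unit p-sphere pile up near zero.
set.seed(1)
median_projected_radius <- function(p, n = 5000) {
  x <- matrix(rnorm(n * p), ncol = p)
  x <- x / sqrt(rowSums(x^2))      # uniform points on the unit p-sphere
  r <- sqrt(x[, 1]^2 + x[, 2]^2)   # radius in the 2D linear projection
  median(r)
}
sapply(c(3, 10, 100), median_projected_radius)
# the median projected radius shrinks steadily as p increases
```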
So with this talk, I wanted to walk you through some of the solutions that we've come up with: what we can do in terms of the displays so that we can still work with tours and linear projections, but better address these situations. And to give you the answer straight away, there are two new displays that we've come up with. The first is what we're calling the slice tour, where the idea is that we only highlight a subset of the points based on some conditioning. I'll talk a bit more about what that looks like in practice on the next slide, but just to say here that this is good for looking at information more locally, which can help, for example, with seeing those concave structures that I've mentioned. This works really well if you have a large number of observations, and it actually won't work at all if you don't. The second display I want to briefly describe today is what we're calling the sage tour, which tries to address the second problem, what I was calling the crowding problem, by adjusting the resolution depending on where we are in the projection plane. And just to mention: I'm saying a large number of variables, but that doesn't have to be in the hundreds or thousands; even with 10 dimensions this is already important, and we'll see that when I talk a bit more about this display. Since this is the useR! conference, I also wanted to mention the implementation. In general, tour methods are available in R in the tourr package, and we've added essentially these two new display functions to the package. They are called display_slice and display_sage, and they are already available in the version on CRAN. Okay, so I'm going to briefly talk a bit more about each of these two displays. First, the slice tour. What's the idea here? With the tour, we already essentially draw a projection plane for each view.
And the way we're looking at the data is that we project the data and look at it in that plane. The idea with the slice tour is that maybe we can get more local information if we also use, in some sense, the information in the orthogonal space. What we want to do is highlight the points that are close to our projection plane, where we make sure that the projection plane passes through the mean, the center of the data, and then fade out all the other projected points, so that we can compare this local view to the overall projected view. This maybe sounds a bit abstract, so there's a diagram here that should illustrate it. On the left, we have points that are inside a sphere, and you can see that at a certain angle we have a projection plane passing through. Then, for each point in my sample, so in this sphere, I check how far it is orthogonally from the projection plane. How far away is it from the plane? And what you can see here is that I highlight everything that's within a certain distance from the plane. This is this h, which we could call the slice radius. Everything else gets grayed out, so I can compare the points captured in the slice to everything else in the projection. I don't have a lot of time to show you examples, so I picked one that I think is very illustrative for the method, which is looking at some geometric shapes, curved surfaces, in three or four dimensions. What I have here on the left is the slice tour of a 3D sphere, where I'm actually doing something slightly different from what I was describing, because you might have noticed that the plane isn't passing through the center of the sphere; I've shifted it to be a bit off-center. So as it rotates, as I change my viewing angle, you'll see that the shape of the points captured in the slice is quite different.
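The slice conditioning just described can be sketched in a few lines: given an orthonormal 2D projection basis, compute each point's orthogonal distance to the projection plane through the mean, and keep the points within the slice radius h. This is an illustration of the idea, not the tourr implementation.

```r
# Which points fall inside the slice of radius h around the
# projection plane spanned by an orthonormal 2-column basis?
in_slice <- function(x, basis, h) {
  x <- scale(x, center = TRUE, scale = FALSE)  # plane through the mean
  proj <- x %*% basis                          # 2D projected coordinates
  # squared orthogonal distance = squared norm minus in-plane squared norm
  d2 <- rowSums(x^2) - rowSums(proj^2)
  sqrt(d2) < h                                 # TRUE = inside the slice
}

# Example: points in a 3D ball, sliced along the first two axes
set.seed(1)
x <- matrix(runif(3000, -1, 1), ncol = 3)
x <- x[rowSums(x^2) < 1, ]
basis <- cbind(c(1, 0, 0), c(0, 1, 0))
inside <- in_slice(x, basis, h = 0.1)
# plot(x[, 1:2], col = ifelse(inside, "black", "grey80"))
```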
And you can really imagine that in terms of how you're cutting through a 3D sphere. Then I also wanted to include a higher-dimensional example: this is a torus embedded in a 4D space. Again, we can see that the slices show very different information from all the faded-out small points of the projection that you can see in the background. And the final example on the right, again in 3D, is a Roman surface, which I just think is really pretty to look at with the slice tour. Okay, I think I have a bit more time, so I also wanted to introduce the second new display, the sage tour, where, as I was mentioning earlier, we're trying to address this crowding problem. Maybe we want to start by understanding that problem a bit better. Really, at the heart of it is the way that volume from a high-dimensional space gets projected onto low dimensions: it ends up concentrated near the mean, near the center of the projection. The way we were thinking about that is in terms of hyperspheres in p dimensions, where p is the dimensionality of the space. You can see this in the two plots here at the bottom. Let's first look at the one on the right. That's the volume in p dimensions, and how much of it is captured within a certain fraction of the radius. The fraction of the radius is on the x-axis and the relative volume on the y-axis, and we see what that looks like for 3, 10, and 100 dimensions. What you notice is that as we increase p, as we increase dimensionality, a lot of the volume gets pushed out towards the maximum radius. With p equal to 100, that's already really extreme. But now, what is interesting is that the opposite thing starts to happen once we project from this high-dimensional space.
What you see is that once we project, looking at where in the plane the volume lands, and again parameterizing this by the fraction of the radius, now the radius in the plane, the opposite thing happens, and a lot of the projected volume gets pushed towards the center. With p equal to 100, we have most of the volume being projected into the first quarter in terms of the radius. So the idea with the sage display is that we want to correct for this difference: we want to make sure that equal volume in the high-dimensional space gets projected onto equal area in the two-dimensional plane. And the way we do that is with a radial transformation that depends, of course, on p, the number of dimensions. A nice way to illustrate this is to look at what happens to originally equidistant circles. What I've drawn here on the very left are, I think, 10 equidistant circles in two dimensions. Then I apply my radial transformation according to p, the number of dimensions that I'm presumably projecting from. What you'll see is that not much happens with three dimensions; things get pushed out a bit towards the outer edge. But as we increase to 10 or 100 dimensions, this gets far more radical, and you see that we're giving much more weight to the inner region of the projection. Especially with p equal to 100, the first two circles basically take up most of the drawing space. And again, just one quick example that illustrates the method a bit better. This is the maybe infamous pollen data, where there's a really small feature hidden near the center of the distribution. If we look at it with a standard tour, we don't really see all that much, but here I'm using the sage display, with two different ways of tuning it.
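The radial transformation being described can be written down explicitly. For a unit p-ball, the fraction of the volume that projects inside planar radius r works out to 1 - (1 - r^2)^(p/2), so mapping equal volume to equal area in the plane suggests a transform like the one below. This is a sketch of the idea from the talk, not code taken from the tourr package.

```r
# Radial transformation equalizing projected volume over planar area:
# the volume fraction of a unit p-ball inside planar radius r is
# F(r) = 1 - (1 - r^2)^(p/2), and equal area means r_new = sqrt(F(r)).
sage_radius <- function(r, p) sqrt(1 - (1 - r^2)^(p / 2))

# Originally equidistant circles, as in the slide's illustration
r <- seq(0.1, 1, by = 0.1)
round(sage_radius(r, p = 3), 2)
round(sage_radius(r, p = 100), 2)
# with large p, even small original radii are pushed far outwards
```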
You see that a lot of the points get pushed out towards the maximum radius, and we can really decipher the word that has been hidden near the center for us. I'll leave it to you to read that off. And I'll use my maybe last minute to summarize the talk. I've briefly introduced these new displays, the sage and the slice display, that we've implemented in the tourr package. We've seen that we can use them to see concave shapes and small features near the center. I haven't had time to show you a bigger example, but what we found is that both of these displays can be really useful when trying to understand grouping in high dimensions; there's actually a small example in the backup slides if you have time to look through them. In terms of the implementation: if you're familiar with the tourr package, you'll find that there's a new display function and a new animate function for each of these displays. I've also mentioned briefly that there are some tuning parameters to these displays, so I think it would be useful to have an interactive interface for setting them, especially because the displays are really fast to generate, so it would be easy to play around and find the optimal parameters that way. A final note: since we have defined slicing in this way, based just on projection planes, we have also used that definition to define what we're calling section pursuit. That's an analogy to projection pursuit, for those of you who know what that is; it's essentially a way of finding interesting slices in the data with what we're calling a guided section tour. And to finish up, I just wanted to thank all of you for listening today, and special thanks to my collaborators, Professor Di Cook and Dr. Stuart Lee.

Thank you, Ursula. That was a wonderful talk. Now we have a lot of questions, and we do have time for questions.
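The two displays from the talk can be tried in a few lines of R. display_slice and display_sage are the names given in the talk; the animate() call pattern and the half_range argument are assumptions about the tourr API, so check the package documentation before relying on them.

```r
# Minimal usage sketch for the two new tourr displays.
# display_slice/display_sage come from the talk; the exact
# arguments shown here are assumptions.
library(tourr)

d <- scale(flea[, 1:6])  # standardize variables before projecting

# Slice tour: highlight points near the projection plane
animate(d, grand_tour(), display_slice(half_range = 3))

# Sage tour: radial transformation to counter projected crowding
animate(d, grand_tour(), display_sage())
```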
The first question for you is: what is the difference between the slice tour and a PCA plot? Okay, so I think that's really a question about the difference between a PCA plot, or maybe a biplot, and the tour. With PCA, you're essentially trying to reduce down to two dimensions so that you can draw a static plot of the data, where you make a selection: the most interesting projection would be the one with the maximum variance in the data. The tour, on the other hand, shows you randomly selected projections, but it tries to show you the whole space, all the different directions to look at. So PCA is good if you know you're interested in maximum variance and you just want a static plot, whereas the tour allows you to get a more global overview of your data. And then there are things in between: you could look at what I mentioned at the end, what's called the guided tour, which again tries to optimize some index function to find a projection that's more interesting than just randomly picking one.

Thank you. We also have another question: do you usually standardize your variables before you plot them? Yes. Whenever you're doing projections that are not just axis-parallel, so you're looking at combinations of variables, standardizing is super important, because otherwise you're not giving equal optical weight to the different variables. And I would say it's even more important with the slice tour that I was introducing, because we're super sensitive to this thickness, the parameter that decides what's inside the slice and what's outside it. If you're not careful and don't standardize your variables, you will probably miss something that's on a different scale.

Also another question, by Jonathan: how does the computational speed scale with the size of the data set, and what is the largest data set you have used?
And could you suggest some paper, blog, or book on grand tours? Yes, good question. So there are different aspects to that. If you're just running the tour in itself, without any optimization, then you're essentially not limited by the size of the data in terms of running the tour, because the projection planes are sampled without regard to your data. But you are limited when you have to actually draw it: if you're running it live, just the speed with which you can redraw all those points can be limiting. One solution I have used in the past is to record everything: if I'm generating a GIF, I just make a bunch of PNG files, and in that sense you can really work with arbitrary sizes of data. It starts to become an issue if you want to do some kind of optimization, where you're looking for more interesting views of the data. In that case, you have to evaluate an index function on each new view, and that can be really slow if you have a large set of data points. I would say that a couple of thousand points still works well, but I haven't tested anything beyond that. What is the largest data set you have used? I'm trying to remember. With the slice tour, because we're cutting through, we have had pretty large data sets, because once you increase dimensionality, you need a lot of points to begin with so that you even capture anything within the slice. I'm sure there were a couple thousand points in those data sets, but I don't recall the exact numbers.

Can you mention some R packages for plotting multidimensional data? That is another question. Yeah, so of course the tourr package is a good place to start. Another one I would recommend is the GGally package, which has implementations of parallel coordinate plots and scatter plot matrices; I think those are the main ones you should be aware of.
And I wanted to come back to the previous question, on reading about the tour, because I just wanted to point out that we've recently written a review paper. So if you look for that, it will have some good introductory information.

Excellent. Thank you, your talk was wonderful, and thank you for taking all the questions. Thank you. Can I maybe ask a quick question? Of course. Maybe it's going a little bit towards my talk: did you try plotting the actual thing not in two dimensions, but in three dimensions? Yeah, I believe there is an implementation of that in the tourr package. I'm just not that big a fan of it, because if it's three-dimensional and it's also moving, I think it's just too much information to process for somebody looking at it. But I do think that some people find it actually easier to work with that display. In that case, I think it shows depth by, maybe, the size of the points, I don't know exactly. Thanks, everyone. Thank you, Ursula.

Thank you. Our fourth talk today is on walking through your data with plotAR, by Philipp Thomann. Welcome, Philipp. Yes, thank you. Yeah, you know there's this lag between submitting a talk and giving it; in the meantime, it was renamed from plotVR to plotAR. You will see a little bit why. Yes, so let's go. Maybe just, I need to share my screen. Please do. So I think you can see it now. Yes, we can see your screen, Philipp. Just one moment, I'd like to have the chat and Q&A open in a second window. So now I'm organized. Okay, so thanks. As I was saying, it's now called plotAR. plotVR, now I'm completely confused. plotAR it's now called. So maybe quickly about myself. Okay, I need to shift this away as well. I hold a PhD in mathematics, and I'm now a managing consultant at D ONE in Zurich. I'm doing mostly projects in data science, machine learning, and visualization as well, so that's maybe a little bit connected to what I will be talking about today.
But plotAR itself is completely a free-time project: I'm doing it on weekends and in my off time, so development is sometimes a little bit slower right now. There are a couple of other projects; if you want, you can look into those. Quickly about D ONE: we are maybe one of the most talented data teams in Switzerland. We do all of data-driven value creation, starting with bringing data into a data warehouse, and working on the data experience, like building dashboards with Power BI or Tableau, or doing specialized work in D3. Then, where most of my work is, machine learning and AI. We are over 90 consultants now and still hiring, so if you're interested, reach out to me, or contact us if you're interested in our professional services. Okay. So, we have heard a lot about how we all love to visualize our things, and we put much effort into producing really cool visualizations. But now and then we all see that there's this gap: the data doesn't quite fit into the two dimensions that we are all used to because of our displays. And actually there's a mathematical theory behind that. In probability, you know that the random walk in two dimensions is recurrent: there's not enough room to cover lots of space, so you always end up in the same place again. It's kind of too small. The third dimension, you might think, is just 50% more, but actually in three dimensions the random walk is transient: if you walk around, you will eventually leave the part where you started and go into different parts. So there's a fundamental difference between two dimensions and three dimensions. And yeah, we all try to do 3D plots with the standard packages, but, as the previous speaker just said, it's always a barrier: in the end, you have just an interactive 2D projection of your 3D data. It's okay maybe for some people or for some things to look into, but it doesn't feel like 3D.
So if you are lucky enough, you might have access to a 3D monitor, where you really get a 3D impression of your models. I think that's mostly used in molecule design and maybe in engineering. But as data scientists, our employers probably won't think that's a good thing to spend money on. So is there a way to get a feeling for the 3D environment using stuff you already have lying around? When Google announced the Cardboard project five years ago, I immediately thought: hey, that's the direction I want to go to really look into my data in a 3D sense. And in the last two years or so, there has been huge investment by all the tech companies into AR. So the idea of plotAR is to give you an open source package: you just install it, start it up, and you can use the AR on your smartphone to walk through your data. So let's see — I'm now trying my luck with the demo gods; it's a somewhat involved demo, so bear with me. You see I'm here on mybinder.org — I will share the link with you later — so you can just go and start this RStudio on the web and it will work for you. You can then open the demo.R, and that's what we'll start with. You load the necessary libraries. You don't actually need to start the server here on mybinder, because it is already running behind the scenes. And then you just take — let's start with iris — you pass it to the plotAR function and say the color should come from the Species column. So let's see. Oh, what was that? I don't know — maybe there was thunder here. Okay, that was close; I hope it's a good sign. So it opens up here in the viewer with a QR code, and you immediately see a 2D projection of your 3D model. But that's not what we want to look at. What we actually want is to see this in the AR of our phone.
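For readers following along at home, the demo steps just described can be sketched roughly as below. The function names (`start_server()`, `plotAR()`) and the `col` argument are assumptions reconstructed from what the talk shows and the package's GitHub README; check the actual plotAR documentation before relying on them.

```r
# Sketch of the live demo, assuming the plotAR API as shown in the talk.
# remotes::install_github("thomann/plotAR", subdir = "R")  # install path is an assumption
library(plotAR)

start_server()                  # only needed locally; on mybinder it already runs
plotAR(iris, col = "Species")   # opens the viewer with a QR code to scan on your phone
```

Scanning the QR code in the viewer then connects the phone to the running server session, as demonstrated next.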
So I'm using an iPhone here; it should also work on Android. So that you can see what happens, I'm sharing my phone screen, and I'm just using the standard camera app of iOS. On Android you need to install a QR code reader for that, but on iOS it's been a standard capability for a couple of years now: you point the camera at your screen, you immediately see the pop-up, and you are there — no installation needed. Oops. Okay, yes, back there. The only difference is that you now see a small icon here in the viewer. I'm going to tap on that, and I'm immediately in an AR session; I need to find a plane, and now I'm walking through iris. Maybe I'll put it on a larger area here. So you really see it: you can walk through your data, and you get the 3D impression because you can actually go up to the data points. If it doesn't fit, you can shift it around by tapping and dragging, or you can pinch to zoom — make it smaller, for instance, so you have a better overview. You can really go into the data; you see the axes, what they are, and so on. Oh, it lost the feed again. Okay, I hope you got most of that. Sorry, I'll show it again. Like that you can shift it around, you can pinch and zoom, and you can walk through the data. You really have the 3D impression, and it's not a translation of your mouse movement into a projection that you have to mentally decode — it's something you are used to. You can just walk up to the data as you would in a real setting, so you have the impression of real data. So, screen change again. Now you can even go further and say the size should be the petal length — I think that was missing. Let's see. No, the petal width was missing. So, reloaded. And now you really have all four dimensions of iris visible and you can walk through them. I'll maybe move on now.
I think I already see more viewers and devices coming in, so I guess people are scanning this QR code and joining my session here. When we gave this talk internally, many people started it too and had their video on, so you could see them all moving around with their phones. That's why I'm now switching to my local installation. And for this part: you can also say, for instance, that you want some text at your data points. Here I only have the species, so I'll do it like this. It needs to load. Actually, I can again go to my QR code and open it up — you see it figures out the correct IP address it should connect to; it shouldn't just connect to localhost, because the phone is a different device. So now I can look it up here again, and I have the text. I use this a lot, for instance, with sentence embeddings, document embeddings, or word embeddings: I plot those using t-SNE to go down to three dimensions, and I get more of an impression than if I had gone down to two. I also have an iOS app I've been working on. Here again you have a QR code button directly, and now I'm connected to this session and can open it up — so immediately I have my plot here. The iOS app is interesting because it gives you a tighter connection between your device and what you're doing. For instance, I can use the keyboard here to move my data forward — so it moves forward on my phone — or backward, or push it up. That's obviously the wrong way, but it's still in development. You can even use your keyboard like in a first-person shooter: just hit W or S and so on, and you are there. Okay, that's this example. I don't think I have too much time left, so if you want to, you can go through the other demos I have here — the gapminder one, for instance.
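The embedding workflow mentioned above — reducing high-dimensional embeddings to three dimensions with t-SNE and then walking through them with text labels — might look something like the following sketch. The Rtsne call is standard; the `plotAR()` arguments (`col`, `label`) are assumptions about the package's API, not verified signatures, and iris stands in for a real embedding matrix.

```r
# Hedged sketch: t-SNE to 3-D, then view in AR with labels at each point.
library(Rtsne)
library(plotAR)

emb  <- as.matrix(iris[, 1:4])                        # stand-in for an embedding matrix
tsne <- Rtsne(emb, dims = 3, check_duplicates = FALSE) # iris has duplicate rows
df   <- data.frame(tsne$Y, Species = iris$Species)

# 'label' as the name of the text column is an assumed parameter
plotAR(df, col = "Species", label = "Species")
```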
So this data — unfortunately I have no animation yet, but I'll come to that soon. Oops, where did it go? Ah, you can't see it anymore — it's too big; that's why I didn't see it. I need to make it smaller. That's weird, that didn't happen before. So there are still some rough edges now and then, but I think you get the impression of what you can do. You can even do the Brownian motion I talked about: you know that in two dimensions it looks something like that, but in three dimensions — do I have a good connection here? Okay, I'll skip that now, but please try it. There's even the possibility to draw surfaces, which is really nice, because you can then really walk through your data. Okay, the demo gods were not extremely willing, but I hope you got the gist of it. If you want to go further, you can always go to the GitHub page here, or you can start your own session on mybinder using this link — you'll get the same thing I've shown you. Right now it's not yet on CRAN; I don't feel the R package is quite ready, but it will be within the next couple of months for sure, maybe weeks. You can also install it in Python. The iOS app you can also get from GitHub, from my account there. For that you need Xcode, and to get it onto your iPhone you need a free personal team — or you can write to me and become an early tester of just the app. I also had an Android version at one point; it's now outdated, and I hope I can invest more time into it. So, maybe a little background on how this communication works. You have a server in the center. This is currently implemented in Python on Tornado, which is a WebSocket-capable server. From the desktop, the libraries basically use HTTP POSTs to send the data. And then there is the viewer, which is actually an HTML document that is likewise just downloaded from the server.
On the other hand, the mobile devices either fetch this data JSON directly — you can see the format I have here; it's not fixed yet, and I'll probably still work on this protocol — or there is the possibility to render these plots not on the device in the app, but on the server, into formats natively supported by the platforms: the USD format for iOS and glTF for Android. Under my mybinder setup, it's actually a Jupyter server running there by default, which then forwards to RStudio. So in the Jupyter setting, I embed this server into the Tornado instance that Jupyter is running, and I hand over the tokens. If you go back, you see the QR code here is much smaller because the URL is short; here the QR code is much denser, because the URL also contains the token. Okay. Most of the features I think you have already seen, so I won't talk about those anymore. Maybe a little about the outlook, the vision I have. All of these formats I'm using are also used for animation — USD comes from Pixar, and Pixar, you know, is animation — so it shouldn't be too big a problem to actually render animations in this setting. It's more an organizational problem: how do you design the protocol and things like that. Another thing — I think there was a question about tooltips. Yes, I'd like to have something where you can tap on a point; maybe there will even be some recognition that you grab something in the AR view. But for starters, you could tap on something and it would either pop up a tooltip next to the data point or send the information back to your desktop, so you have the information on the desktop. Both are really possible. And then there's a really amazing thing I'm thinking about.
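To make the client–server exchange concrete: a payload along the following lines could be what the desktop library POSTs and the phone fetches as "data JSON". The speaker explicitly says the protocol is not fixed yet, so this schema (field names `data`, `col`, `axes`, `type`) is purely illustrative, not the actual plotAR wire format.

```r
# Illustrative sketch of a possible data JSON payload (hypothetical schema).
library(jsonlite)

payload <- list(
  data = as.matrix(iris[, 1:3]),    # serializes as an array of [x, y, z] rows
  col  = as.integer(iris$Species),  # color index per point
  axes = colnames(iris)[1:3],       # axis labels shown in the AR scene
  type = "scatter"
)
json <- toJSON(payload, auto_unbox = TRUE)
# The desktop side would then HTTP-POST this to the Tornado server, and a
# phone scanning the QR code would GET it (or a server-rendered USD/glTF file).
```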
You can add — that's a no-brainer — different scatter-point symbols: not only a sphere but also a cross, or say an arrow pointing in a direction. And then you can add, as two more features, the rotation of this scatter-point symbol. If it's an arrow, that means you can plot vector fields, and you can really walk through the vector field. And I want to put it on CRAN soon enough. Another thing is that the files that are produced can actually just be saved, and you can see examples of that on the examples page of my GitHub account. There you have just a couple of files lying around, which means you can — here it's embedded, so you can drag it around — or you can open it on the platform of your choice, so you have more flexibility there. And last but not least: right now I'm only producing single plots. I would really like to have multiple plots in my setting, which then becomes a kind of AR dashboard. The funny thing is, I actually started working a little on the architecture slide in AR, so I could have shown it to you in AR, but I guess that's for next year or so. So thanks a lot — I'm interested in your questions. Thank you, Philip. Really interesting, an absolutely novel talk. Thank you so much. We have lots of questions. The first one: can plotAR plot text and images as scatter points instead of circle dots? Yes — text I showed you: basically, if you have a text column, you can specify that. There is, for instance, this example where I plot my teammates at D ONE with some very simplistic feature generation, and you see all the texts are different. Actually, for the glTF format, these texts are images: I render the texts on the server into images and can only embed those into the glTF format. The USD format is maybe nicer in that respect, because you can just put the text there directly and it always looks crisp.
The downside of that — so obviously here I could also put in images for the scatter points. I've had a little difficulty doing images in the USD format so far, but these formats are just a bit involved, and bringing them together shouldn't be that big a problem. So that's definitely something: I want to have the mug shots next to the names here, you know. Now I cannot hear you. We have another question: what does the toggle flying mode do? Yeah, the idea there was mostly for the VR setting: you can start flying, and then just by looking in the direction you want to fly, you get there, and you could also increase the speed. In AR it's different — in AR you can use your fingers, so you don't need to fly through it. In VR, in the Cardboard setting, the only interaction you have on the device is that you can kind of click — you have one button. So there it was nice to be able to fly through, and also to use the keyboard with the second hand to navigate. Okay, one more question: are we able to have tooltips display on the figure when we walk through the data on the phone? Yeah, that's for sure one thing I want to do, but not yet. In the USD format that should be possible easily. In the glTF format it's not possible inside the format itself, but what I'm actually using here is the model-viewer HTML library, by Google I think. There, I believe, there's also the possibility of interaction, and I'm not sure whether that translates well to Android, but I think it should. So that's definitely one thing I want to have. Okay, thank you so much, Philip. Looking forward to using this. Thank you so much. We have a question for Philippe Heymans Smith. Philippe? I'm here.
Yes, hi. We do have a couple of questions that came in on the Slack channel. One is: what do we do if the standalone interactive HTML report's file size gets too large — too many pictures? Yeah, so basically the only hard stop that Chronicle has is for the case where one single plot points to a very large dataset. Computing or predicting file size from plotly outputs is very hard, so that's an open question — if anyone has any suggestions, I would greatly appreciate them. But as of now it will not warn you; it will just hang, producing gigantic reports that probably no current browser can handle. Thank you. One more question: can the Vega functions be used for the plotly plots, besides ggplot2? Is that one for me? I believe so. I haven't experimented with Vega at all, so I would like to get back to you on that. Okay, great. Thank you so much, Philippe. And I want to thank all four speakers today for your talks. Up next is a 15-minute break where we can all hang out in the #lobby channel on Slack, and after that, at 7:15 AM UTC, is the keynote talk, Research Software Engineers in Academia, by Heidi Seibold. It has its own Slack channel at #heidiseibold. We'll wrap up now. I want to thank our speakers again, our sponsors RStudio and Roche, and my co-host Adrian Mega as well. Thank you so much, everybody.
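Since Chronicle itself won't warn about oversized self-contained HTML output, one pragmatic workaround is to check the rendered file's size yourself after calling `render_report()` (shown earlier in the talk). The `filename` argument and the 50 MB threshold below are assumptions for illustration, not documented Chronicle behavior.

```r
# Hedged sketch: guard against huge self-contained HTML reports after rendering.
library(chronicle)

render_report(report, filename = "my_report")  # 'filename' argument is an assumption

size_mb <- file.size("my_report.html") / 1e6
if (!is.na(size_mb) && size_mb > 50) {          # 50 MB is an arbitrary example threshold
  warning(sprintf("Report is %.1f MB; browsers may struggle to open it.", size_mb))
}
```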