 Thank you everybody. So Well, I am impressed so it is Friday on the conference It is after the coffee break and you actually came as volunteers to talk that is boring Now there may be some confusion the restroom if you want to have a nap is over there It was the one day before at one door before Yeah, why is the boring office talk? So well truth been told a couple of years ago I switched to the dark side So I'm mostly do project management nowadays and there as you know Well, there's a lot of excellent PowerPoint involved and so I'm on my little quest to automate that so I looked for Available modules because there are always well a couple of modules available in Python and in this space there Quite a lot and what I would like to do is first of all well motivate you while boring things can be interesting And then to give you a let's say short introduction and what's available and how it could look like Hoping that you will then start to do your own well boring stuff with these modules and maybe with some insight from this talk So motivation some things should be boring. I like boring So I don't have to think at all about the well Equipment in this and conference center. I don't have to think about the plumbing at all It just works always when I have to think about the plumbing. There's a problem So and I would like to have the same situation for Well office files because they are to a certain extent the plumbing in most organizations that I know they get sent around Somebody looks at them annotates them So well a way to communicate and I want to be make this boring. I don't want to think okay I have to copy and paste a lot of files a lot of images a lot of pictures a lot of PowerPoint slides at Friday 8 p.m. Just before just before I hopefully can go into the weekend That's something that we should do automatically Which allows me to introduce my sweet characters Totally um well archetypes so we have the boss the boss as always wants an excel file hopefully with nice numbers, but Well in doubt an excel file with the right numbers have will have to do then we have Well this little guy here So he's working in office So he's the only one with a tie here and he mostly works Copy pasting combining do an excel files. So we'll learn what he does later and he has one co-worker and His co-worker is a Python knowing pragmatist. So maybe he studied physics learned about Python now started a job and While he knows the all the files that has to be produced. She knows how to automate it Well, and he knows that he wants the files And again, what do I hope to accomplish with this talk? I'll have to warn you the code on these slides if you compare with what you've seen as conference It's not pretty it's API code and it's API code that is pretty closely aligned to how the file formats work so you will see lots of calls that well Enter some data menu play on data do highlighting. This is not something that is really exciting It's just useful. Well as a talk applied is boring. So What you should get out of this talk is module names There are a lot of different modules that are available and a rough understanding how much effort you would need to invest to say Haven't nicely highlighted excel file have a PowerPoint with chart have a PDF file that looks like you're on PowerPoint file and With that you can maybe just convince People at your workplace to let you play with pundas because you can do calculations in a repeatable way and then create some Let's say business conformed files from that that your boss can then take and add it around So ultimately what I want you to see and to do is when you maybe sit down on Monday when you Talk to somebody and say okay. I could send you an excel file and Whether this with a CSV file, you know, you always have to think on how to really open that next of heart is something that you know And I can even do a PowerPoint for you and that Once you have automated it will be easy with Python on All the code which is not much is on GitHub I will also there's also the slides on GitHub right now as soon as I have figured out how to upload the slides for the conference You'll also have some there So you can take this and hopefully well find something interesting Another word about the philosophy about how I do things for the stock We have learned a lot about good and best practices in using Python This is I tend to follow it especially when I start more or less a quick and dirty approach So if you think about automating things in your work You may be if you have a software engineering background you may be tempted to think okay. I Will need a full testing framework I will need to have this in Docker and it should be a web application and some extra parameters Personally for things like that. I would advise against this What you're trying to do is just save you some work to get started Especially this is a new bit talk if you haven't done any Large Python and programs before I think this is a very good way to get started because you can do small things Then you can check whether they are correct And this is the well basic idea. So what I do for example is just to With exactly these API's is just to combine some time sheets So we have multiple consultants working on a project each of the time sheets I basically copy them together and see whether the numbers fit and then put I'll put them in a different format It is not especially challenging. It's I think 50 lines of code The good thing is I don't have to think about what I've done every month I just want the same script and things are pretty much okay. So what do we want to do? Basically, I've just three Tasks in this talk. First of all, we have multiple Excel files. The Excel files here Standards for any data you might accomplish it might be a CSV file. You may read something from the database You may have your little web crawler that checks pricing information on the internet So you have multiple data sources that you want to combine. That's fine then you want to transform them a little bit and write them out to some Other Excel file Okay, so far so good. You can do all this with the let's say pandas standard API and Then you want to make it Somewhat prettier some highlighting some specific Excel features that you or your organization like like Then we go to the multi more presentation focused stuff. We built some power points Once we have the power points we will return those into PDFs That's it The interesting thing about this particular task is that there are many many good tools available So a personal favorite Think of all its pandas it reads almost everything it does almost all kinds of calculations and it exports also quite a lot That's a nice Excel export for example Then you have with Excel writer and open pixel to API are two modules that are very good at working with Excel files This one hands the name and can only write them This is the one that we are going to use here mostly for let's say taste reasons. I like the RP API slightly better Open pixel can also read Excel file so you can read something in change just a few numbers write it out again For PDFs we have PDF read write which we just used to combine some on PDFs. There is In a gigantic number of PDFs library available for Python depending on what you want to do So you'll probably need to invest some time What is quite nice, but quite big if you wanted to do custom reporting is rapid lab again Something you won't see here, but just to give you a point of this is well if you want to do PDFs with Python and Want them to be more complex and want to see here. That's fine I will use leper office basically in headless mode to do some file conversations and Yeah, then we have Python PowerPoint PBTX Which as the name suggests does PBTX files? okay the overall structure is if you look at the GitHub repo. It's just the main method. So we read data We do the extra format Then we do Some nice output then we output the PowerPoints and then we do the PDFs So if you check out the codes that should basically all work You should have a couple of files. If not, let me know in which case most likely something went wrong with the share Okay, what is our task? These are Extremely made up and boring files. So the idea is we have Some project team. This project team has Expenses for materials. They want to build a mass station and they have Their working times and well, everybody has to be paid. So they have some expense per hour So these are three files that you got maybe from your accounting department maybe from your project team themselves and Now you want to combine them and do some reporting because you want to know what actually my expenses So first of all you work with pandas to load in all these files. So you have three data frames and well We're not really much to see here so With this excel files Each of these excel files only has one worksheet Okay, now you have the three data Then you want to do something that you would normally done by copy pasting an excel So if you work with excel, um, well, we can do it. Who loves pivot tables and excel Okay, that's who we are for people who uses pivot tables and have to somewhat grumbling Okay, so I will assume from that that you would enjoy to use python slightly more and well, there is practically pivot So what we do is the same thing that we most likely do in an excel file and um, if you can't follow this I don't think that you should be concerned Just know that everything that you could do in excel you can also do in pandas and thankfully especially from the um Pi data people there are a lot of very good um pandas tutorials available So I'll just walk you through what we are doing here so that you see the ap is quite well beginner friendly and the documentation is excellent So whatever you want to do a thing you can google to yourself to success okay, so What we basically want to do is we want to combine these two files So we um update we just Start from some I'm starting at some end town. We calculate overall cost by rate So we combine two or we combine the rate and the hours per update table This is basically this part here Then we add some extra column just to say what um cost type and cost type That would be and then we combine them So this is I think a pretty standard copy and paste job And as you can see it's not so much code. So if you wrote this not for demonstration purposes It's most likely the same code up here into that copy here. So maybe 10 lines of code So that's nice. We have the calculation And the calculation is now copy and and paste proof you if you have to Have to do this multiple times you can do it multiple times you can even do it on a server I think it's nicer than doing manual So this is when we end up with so we have The different cost positions where do they come from is it working time or are these expensive? Who is responsible for it? We have well two projects team dinner and mass colony. So it's two rather Let's say ambitious things And we want to output this to an excel file So this is almost the unpunded standard way of doing it. So in pandas you can do two things You can either go to excel and write out a single excel file or you can Initiate your own excel writer and the excel writer will make it possible to have slightly more control particularly here you will create well Different um worksheets So these three worksheets all going to the writer and you can give them a cheat name And now you have well some control over your excel, but it's basically still all data so Well, I think it's not really management friendly quite now So first of all what we want to do is we want to have some introductionary sheet into our excel We want a company logo version number and maybe some formatting And to do this You take a workbook Object you can get the workbook object from the writer object and the workbook represents. Well, basically your excel sheet And you can then define with excel writer some formats And there's a large number of different properties that you may want to use in this particular case I just want to say I want to sing in blue And slightly boat this ends up here So I have now created a new format and now I write it this zero zero means row zero and um line zero The api also has these um excel specific um coordinate. So if you want to go a1 Is it where you can also do this I write my text and bold in this case is this um format object. So write this out I can also go unformatted. So just over a cost thing is here And I can play around a little bit with a column width so set column takes Which column it should apply to so first column to first column. We could also go multiple columns and I'm says how Which widths what do I want to have in terms of um characters? And I'll write this out. So I have my first formatting and I can include an image Well and include typos, but That's another thing Okay, so what we have done so far we have output an excel straight from pandas and we have created a new sheet into in this excel Okay but Well, what do we want to do if we need slightly more control? So for example if my data is not in a pandas dataset Or if I have some other data that I want mic to add in a arbitrary position So you have two methods. We see the first one here. Well, you have multiple but you can write whole rows Into the system. So what we do here is just we take our columns Add them here. So we have a header and then we add The raw information in the different rows So now I have control over how to update this. So if you have something in a dict or wherever you get it from You don't need to well assume that you do an export just from pandas Now This still looked rather drab. So where we want to end up is something a little bit nicer So what you you see here is More manual control over what you're doing. So first of all We're building your worksheet And then we add two formats this time and you see the different properties that are available Think bold. We already used we have different font sizes and different colors Then we want to write the column information out again. We apply one of these Font information one of these styles and then the next thing that we want to do is to And change the column widths again. So this happens here And then we want to write out our data And this time we do it really data element by data element. So we don't use The raw function but just go over all our data And what we want to accomplish is we want to change everything that is higher than the 75th percentile in the data in red So we just use pandas to get this particular value And then we go over every row in our data We look which column we are because the first column is data is a number the rest is text And if we are above the Percentile value, we just apply red bold. So it's the other style that we defined Then it looks like this so Get slightly more. Well interesting. So you don't have these Desert of numbers anymore, but you can highlight numbers that are specific interest Now those of you who like excel too much like me Might notice, okay, but there are conditional formats in excel So I could just if I do this by hand I could do all that in excel and in terms of highlighting I could maybe do it in as a table That's also possible. So You can just say, okay, I want specific number formats like excel wants them and define those and I can Define part of my excel sheet as a table so Okay, so in order to do that you just Go for this data and then what you can also do is say I want Conditional highlighting conditional highlighting if you can own excel your work you can define custom rules And they are quite a lot of complicated properties that you can use but My all-time favorite is just this sweet color scale. So it's gave basically gives you a traffic light And you can apply those again to a specific part in your spreadsheet In this case, it's the same part that we used for the table and You have a nice colorful excel sheet Okay We'll have to speed up a little bit. So I'll just speak though. You can also create charts charts are the Most ugly part in the api and mostly because of this part here Because what you do is just here you add all the data to the table You need the data on the excel sheet in order to create a chart excellent to find them somewhere And then you create two series each series has a name in this Case could be um, well The expense information has a category that the person who is responsible for um for the price and well Once you have this You can just and create a chart You could also go for mudplot lip at this moment So it's a we already saw earlier that we could use pictures as you could the advantage of this one is that It is um an excel chart. So it's a dittable. You can change it Okay, so much for excel now. We'll get just go over the power point thing um, the power point module as a thing is Wonderful. So basically everything you can do by hand you can do with a module for excel There are still some limitations. So I didn't find a way to do pivot tables in the format that excel expects them So The thing about power point is one difference You should probably prepare some template that you want to change because you have a lot of formating information that is Will not necessarily be required for the excel file And then you can just go and fill out placeholders and write out Well Uctly tables in this case on purpose that you'll see that I did something myself So that I'm not what I'm not a designer And same thing you can add pictures to it You can add tables to it here. You will see that we use this module pandas to power point. It's I'm not pip installable, but you'll find it on github. It's a couple of hundred lines And Again, you can add charts basically, well the rp is Close to it. There are a lot of magic numbers in here and You'll probably best off starting to copy the code and then change it So it's not the nicest code But it is useful. So you I want again Get a native power point object One things I didn't figure out so far is how to take the charts out of my excel into my power point like I would do with copy and pasting Anybody has a hint there that will be helpful Okay Now we have also the power point If you want to change this into pdf you need something that's able to change and to render power points I didn't find a pattern module to do that Luckily, there's leaper office and leaper office has a command line mode So what we can do is just automate the command line mode Some of you may be familiar with that process. So what we do here is just build a command line Look for the executable add some options. These are ones other ones that are interesting and then Well executed and at least on the mac The version that comes out of the power point in the version isn't the pdf Look 90 percent the same. So that's quite practical What you can also do is pdf read write It's just read an existing Um pdf file. So let's say you have a title page an outro and some pages that you built So you just read those in And then you create a pdf writer Then you can just say take this page add it to the output take this page add it to the output and many things more Okay And that's basically all we've done. So we have now seen that it is possible to create Nice looking excel files Some people like this We have seen that it is Possible to create um power point slides. So my personal dreams in our view is that in some kind in the future I will be able to create um gun charts from excel files That would be great. So you're working on that And you can even create um create pdfs for archiving or whatever you want to do from those files There are some things that you can look into learn more Especially if you are new with pysons, this book is nice It's nicely written and um the html version is free If you um want to deep die have a deep dive into the wonderful world of pdfs Then you could also do this um On the report lab page. I think this is the Sign that I should um shut up now Yeah, and the documentation for each of these modules is excellent. I'm meaning we're in excellent So the people who did did those modules. It's it's almost heroic. It's not really the nicest work to do that But they did it really well Okay And that is Honestly, honestly, no, uh, so I think what you'd like to think you can do it via com Um, I think there's also now a product I think it's excel wings where you basically embed python into excel those is also available for the mac I'm the reason that I did is the way that I did is just I just wanted to have a script that runs without opening excel Because it for me that makes it debugging so much easier Pivot tables will be preserved in open pixel So if you have existing Excel files with lots of pivot tables And you just want to change the data use open pixel Yes, um There are some things that you have to keep in mind. So they it changes the um the file This for example means that if you have some computations in the file that excel would trigger automatically once you have for example included a pivot table You will need to trigger in excel again So you cannot um assume that everything that would happen in excel will also happen if you create the file outside of excel That makes things You can Yeah, you can The question is and does excel do it so with open pixel? Yeah, you can change the data And you can you can force the pivot tables to update Okay And you should be able to copy and paste A excel chart into pivot into powerpoint Okay, in this case we can talk about this because this would be extremely interesting It's not directly possible, but you should be able to take the spreadsheet drawing uh dump the xml and Attach it into a powerpoint file. It's not that difficult. Okay. Yeah, that's slightly lower levels. Um, Well, I went with this If you want to work with templates It's it's it should it all be doable Okay I'm the open pixel maintainer. So I know this stuff Thank you. We have time for one more question. Sorry Someone here had a question Was basically also the same question about how to Interact it with an open instance of xml powerpoint and yes, there is this open excel wings Which does this for excel and you always can do it via com, but that's quite cumbersome I'm not aware of anything Which can do the same with powerpoint, for example And if someone knows something just talk to me Um, have you have any experience with with words and templates and if you put text in it if The text is holding the template styles and also the the talk From heading Have you some experience with it that it that it works well with python? Sorry Hardly did any work or anything with word files Okay, uh, I want you to thank again seven for a good talk