Hi, can you hear me well? OK, perfect. So today I would like to speak, or perhaps even talk with you, about Python, obviously (we are in a good place for that), about data science, and about open source and education. So many, many topics, actually.

Just to make this very clear at the very beginning: I'm not a programmer. My background is in diversity, my background is in sociology. I also work with companies, B2B. But for many years now, about seven or eight, I have been working on diversity and on educational programs in programming and in data science. So please don't ask me a question like, "OK, so how can I become a data scientist in three weeks?" First of all, I have no clue. And second of all, it doesn't work like that. The second question which I would not answer, just standing here and smiling very broadly at you, would be, "OK, so which algorithm is better, algorithm A or algorithm B?" We are very lucky to have many people here who will be very happy to answer this question. That's not me. Also, if you ask that kind of question, the answer you get will perhaps be, "OK, could you be more specific? Could you tell me something more about this issue?" Just so you know.

And I would like to learn something about you. So how many of you have been writing Python code for less than six months? OK, perfect. How many of you have been writing Python for more than two years? OK, and how many of you have run more than 100 data science experiments? OK, you will not learn so much. Maybe something about open source, I hope. And how many of you consider yourselves very, very fresh in data science? Oh, you are in the right place.

OK, so I hope this presentation will be helpful for you in a couple of ways. And the very first way is resources.
So I prepared more than 20 links for you about open source, about data science, about Python, and about different groups all over the world. You can find all these resources, as well as the whole presentation, on my GitHub account. So please just use that. It's also CC BY, so you can use it freely. As long as you say that the author is Kamila Stepniowska, and this kind of thing, you will be OK.

So, some basics about open source, some basics about the data science workflow. And after that, something which I hope is very helpful, and not only for beginners: the way of thinking that if you are learning something, it's good to have three things in your learning experience. One is working on projects. The second one is cooperation, so cooperating with other people. And the third one is contribution. So it's not only about learning from others, it's also giving to others. For example, giving a talk, giving a lightning talk, preparing some open source materials. I hope that's something which you will take from this presentation. OK, so shall we?

Open source. I believe that many of you are familiar with open source. We are in a really good place for that. But just as a reminder, it's about free use, free modification, free sharing. As a user, you might consider two cases. One case is when you are using materials which are text, pictures, or videos. If those are open materials, they will perhaps be under Creative Commons. And if you are using code which is open source, then perhaps it will be under one of the very popular licenses. That might be MIT, that might be GNU, that might actually be many, many others. Apache is also very popular.
And one thing which is really cool, for example on GitHub, is that when you are creating a repository, you can just choose the full license in GitHub. You don't necessarily need to copy and paste it.

So just as a user, if you are using some materials which are open source, please keep in mind that if those are Creative Commons, then you have a couple of variations. The most important thing is to remember that the basic one is CC BY. CC BY allows you to use the material freely, to share it, to modify it, and to use it for commercial and non-commercial purposes as well. And then you have a couple of variations, which basically differ in whether you need to use the same license or not, whether you can change the source or not, and whether you can use it for commercial purposes or not. So those are the basics, very useful basics, I hope.

And if you are a creator, if you are building your own code, or your own text, video, or other kind of material, here are links which hopefully will be very helpful. In terms of general use and general selection, Choose a License is a very good website; for text, basically Creative Commons; and for code, opensource.org. Please remember that this will be in the presentation, which is available for you.

And let's go to Python. Asking the question "why Python?" in this place is a little bit tricky. So maybe someone from you can actually answer me: why Python? Why do you use Python? TensorFlow. Oh, great. Yeah, so data science, basically, yes. Sorry? They rule the world. OK. Yeah, in a way, definitely. And one reason why they, or we, rule the world is that it's a community. So it's very easy to create things, very easy to share these things, and very easy to contribute, actually. Nice. So yeah: welcoming, supportive, very good for complete beginners.
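To make the Creative Commons variations mentioned a moment ago a bit more concrete, here is a minimal sketch in Python. It is not from the slides; the `CCLicense` class and the `licenses_allowing` helper are hypothetical names invented for illustration. It simply encodes the six standard license combinations by the three questions raised in the talk: same license or not, modification or not, commercial use or not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CCLicense:
    """Permissions of one Creative Commons license (hypothetical helper class)."""
    name: str
    commercial_use: bool  # False when the NC (NonCommercial) element is present
    derivatives: bool     # False when the ND (NoDerivatives) element is present
    share_alike: bool     # True when the SA (ShareAlike) element is present


# The six standard combinations; every one of them requires attribution (BY).
LICENSES = {
    "CC BY": CCLicense("CC BY", True, True, False),
    "CC BY-SA": CCLicense("CC BY-SA", True, True, True),
    "CC BY-ND": CCLicense("CC BY-ND", True, False, False),
    "CC BY-NC": CCLicense("CC BY-NC", False, True, False),
    "CC BY-NC-SA": CCLicense("CC BY-NC-SA", False, True, True),
    "CC BY-NC-ND": CCLicense("CC BY-NC-ND", False, False, False),
}


def licenses_allowing(commercial_use, derivatives):
    """Return the names of licenses that permit the requested kinds of reuse."""
    return [
        name
        for name, lic in LICENSES.items()
        if (lic.commercial_use or not commercial_use)
        and (lic.derivatives or not derivatives)
    ]


# Which licenses let me modify the material and use it commercially?
print(licenses_allowing(commercial_use=True, derivatives=True))
```

This is only a memory aid for the combinations; the license texts themselves are what actually apply.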
So in terms of the general learning experience: building a project, finding a project which is interesting for you and which you are very dedicated to, finding the right people, and finding a way you can contribute. I hope that's helpful.

In terms of being a beginner in Python, it's good to know PEP 8. It will help you a lot in terms of how to use Python properly, how to follow good practice, how to have a good style. It's something which you might not think about at the very beginning, because you just want to write code and you want it to work. But it's something which will help you later on.

Oh, the Zen of Python. You might have been at the lightning talks yesterday, and there was a little bit of trolling about the Zen of Python, about why it is not helpful. I would say it was trolling and a really good presentation, by the way. But the spirit of the Zen of Python is more at a very, very high level. So please don't take it too seriously, but at a very high level it might be helpful.

If you are a complete beginner in Python, the Python Software Foundation is actually the best place to go. The Python Software Foundation has really great resources. And I really like that they give resources for programmers and for non-programmers, because that's very helpful. It's a different state of mind. In the resources you can find books, videos, tutorials, and many other kinds of resources. And it's updated, so it's alive.

And you might know Lynn Root. She will also be a keynote speaker. Her talk, which I believe she gave for the very first time at EuroPython in Florence, is Sink or Swim. There you have not only rules for how to learn to code as a beginner, but also some projects which you can basically run yourself.
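As a small illustration of the PEP 8 point above, here is a hedged before-and-after sketch. The function names and the example are invented for illustration and are not from the talk; both versions compute the same average, and only the style differs.

```python
# Before: works, but ignores PEP 8 advice on naming, spacing, and layout.
def CalcAvg(L):
    s=0
    for x in L: s+=x
    return s/len(L)


# After: same behaviour, written in PEP 8 style
# (snake_case names, spaces around operators, one statement per line, a docstring).
def calculate_average(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    total = 0
    for value in values:
        total += value
    return total / len(values)


print(calculate_average([1, 2, 3]))
```

The Zen of Python itself ships with the interpreter: typing `import this` at a Python prompt prints it.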
So there are multiple projects, on APIs for example, on some chatbots, and some other projects which you can basically use for your education.

And OK, so data science. You know, machine learning, it is how it is: we are searching until we find the right answer. The right answer, that's the most important thing. But besides that, why do data science and Python go together so well? I'm not sure if you are familiar with this survey. These are the 2017 results. PyCharm, in a way, had a hand in it as well. But it's a very good survey, which was taken by more than 9,000 developers from almost 150 countries. It gives really good knowledge about how Python is used by developers nowadays. And data science is actually very, very strong in this survey.

And what kind of technologies in data science? You can see the ones that are very popular now, which is pandas, which is scikit-learn. You also have many, many more. That's a good slide for those of you who are new to data science, just to check what's there.

And a couple more which might be helpful. PyCharm and Spyder are IDEs, so those are the environments which you will be using. Spyder is basically aimed at scientific Python use. PyCharm is more general purpose. And something which I was actually using very often is Jupyter Notebook. It's very helpful in terms of trainings especially, because if you are building a training and you would like your participants to basically work with code in real time, that's something which can help you a lot. And it's very easy to prepare. Jupyter Notebooks (IPython Notebooks, that was the previous name) are very helpful in programming education.

So in general, Python in data science is something which you want to consider as a tool to build your tools. Using Python is not a purpose in itself.
Python is a programming language which can help you build what you really want to build in data science. It's just a tool in this sense.

And some words about what a data scientist's everyday life and everyday work look like. A big part of the time is just preparing data. So it's seeing what you actually have in the data set, seeing what you are missing, seeing what kind of errors you have, so cleaning the data, and then just praying to have enough of it to run your experiments.

But then there is the fun part. This is just an example of how you can think about it. The most important thing on the slide is understanding your problem, the issue which you are focusing on. It's crucial to understand what kind of data you have, what the input is, and what you would like the output to be. It's very crucial. And it's something which, at the very beginning, you might not think about so seriously. And then you have this really fun part between search and experiment. Actually, it's going back and forth: searching, experimenting, searching, experimenting, searching, experimenting. So there are many steps to take. But it might really be the fun part.

And here is a question. When I was preparing for this presentation, I asked a couple of data scientists, my friends, how they actually find the right algorithm. How do they find the right sources? The very first version of the question which I was asking was, "Where do you find the right resources?" So I got a very simple answer: "All over the internet." OK, yeah, great. That's good to know. But what are the criteria? How can you decide whether an algorithm is good for your purposes or not? You will use your own judgment here. But there are some good practices. If you are more experienced in Python, you can just look at the code and see what's there. Try it. Try to modify it.
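The workflow described above (look at the data, clean it, then go back and forth between candidates and experiments) can be sketched in a few lines. This is a minimal sketch under stated assumptions: the data set, the two toy candidate models, and all function names are hypothetical and invented for illustration. It uses only the standard library to stay self-contained; in practice the same steps would typically be done with libraries such as pandas and scikit-learn.

```python
import statistics

# Hypothetical raw records: (hours_studied, exam_score); None marks a missing value.
RAW_ROWS = [
    (1.0, 52.0), (2.0, 55.0), (None, 61.0), (3.0, 61.0),
    (4.0, None), (5.0, 70.0), (6.0, 74.0), (-2.0, 50.0),
]


def inspect(rows):
    """Summarise the data set: row count, rows with missing values, invalid rows."""
    missing = sum(1 for x, y in rows if x is None or y is None)
    invalid = sum(1 for x, y in rows if x is not None and x < 0)
    return {"rows": len(rows), "missing": missing, "invalid": invalid}


def clean(rows):
    """Drop rows with missing values or an impossible (negative) number of hours."""
    return [(x, y) for x, y in rows
            if x is not None and y is not None and x >= 0]


def mean_baseline(train):
    """Candidate A: always predict the mean score seen in training."""
    mean_y = statistics.mean(y for _, y in train)
    return lambda x: mean_y


def linear_fit(train):
    """Candidate B: least-squares straight line through the training points."""
    xs = [x for x, _ in train]
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(y for _, y in train)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in train)
             / sum((x - mean_x) ** 2 for x in xs))
    return lambda x: mean_y + slope * (x - mean_x)


def mean_absolute_error(model, rows):
    """Average absolute difference between predictions and true scores."""
    return statistics.mean(abs(model(x) - y) for x, y in rows)


def run_experiments(rows, candidates):
    """Fit each candidate on all but the last two rows, score it on those two."""
    split = len(rows) - 2
    train, held_out = rows[:split], rows[split:]
    return {name: mean_absolute_error(fit(train), held_out)
            for name, fit in candidates.items()}


summary = inspect(RAW_ROWS)
rows = clean(RAW_ROWS)
scores = run_experiments(
    rows, {"mean baseline": mean_baseline, "linear fit": linear_fit})
best = min(scores, key=scores.get)
print(summary, scores, best)
```

The point of the sketch is the shape of the loop, not the models: the question of "which algorithm is better" only gets an answer once the problem, the input, the output, and the score are pinned down.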
Trying the code yourself is one of the very good options. And basically, as always, as in science in general, look at the source and ask: what is the credibility of these people? What is the credibility of the source? So nothing new. You just need to try.

Some hacks. This is for more advanced people, actually. It's a tool for when you have already run more than 100, 1,000, a couple of thousand experiments, and you actually have the issues which you will have at this stage, when you really need to know what's going on in your experiments and what it means. Steppy is a library which might help you. The two very basic abstractions there are steps and transformers. So that's something which you should just take a look at.

And a good thing in terms of resources for data science is this website, Data Science Masters. It's all-purpose. Even if you are an advanced data scientist, it might be a website which you would like to take a look at. It might be very helpful. You have a bunch of resources there: videos, regular tutorials, text. A whole bunch of things.

And do we have any mathematicians in the room? OK, there is one. OK, perfect. And there is a second one. OK, but you are advanced, so it's not for you. But if you are a mathematician or a physicist and you would like to start with data science, there is a blog post by Piotr Migdal which might be very, very useful. He made the switch himself from science, basically from mathematics and physics, into data science. So it's a very personal blog, but with very, very useful things, useful hacks which you can find there.

OK, how much time do I have? More than 10. OK, perfect. So how can you learn? Basically: projects, cooperation, and contribution.
In terms of projects, this is something which is very good for complete beginners. If you are a beginner in a particular programming language, if you are a beginner in data science, if you are a beginner in, in my opinion, anything connected with coding, it's good to find your project. It's good to find what you would actually like to build and why you want to build it, and then check if it's possible. So I would say, start with "the sky is the limit" and then see what's actually possible. The other way is to see what's already there. So let's look at some projects which exist in data science and see if there is something which is interesting for you. Here on the slide you have free resources where you can find data science projects. You can just see if there is something you feel connected to, in terms of, "OK, yes, I would definitely like to build something similar," or, "That's a topic which I would like to dive deep into." So that might be a good thing to do.

And cooperation. I'm very happy to see a gentleman in a PyLadies T-shirt. Cooperation is about how you can learn from the community, how you can be a part of the community, how you can build a community which will be very helpful for your learning experience. But it's also just nice. It's nice to be a part of a community as well. It's safer and it's just nicer, I would say.

In terms of offline things, ladies are very lucky in a way. There is PyLadies, and there is Geek Girls Carrots, which is something I co-created. There is also Girl Geek, which is very nice. They also have a presence here in Edinburgh. And there are a bunch of local groups which usually have monthly meetings and usually also run some workshops or conferences. You can find a lot of information about these local groups on the internet.
And you can think about that as one life hack which I have. I travel a lot for my work, and also just because I like it. If I'm in a new city, I check if there are, for example, PyLadies meetings, or if there is a Women Who Code meeting. And for all of us, not only women, there is definitely PyData, which is also global. That's a community which also has great meetings. So that's something to check.

And for online presence: PySlack definitely, and the Python Tutor mailing list, which is something very helpful if you have particular questions. Because this mailing list is really old, many questions have already been answered. So you can just go to the archives and see if you can find the answer to your question. That might be a good idea. And there is a group on Facebook, Python Programmers, which might also be helpful.

And contribution, the best part. The bug tracker is very easy. If you go to the website, bugs.python.org, you will see a bunch of open bug reports. You can see if a bug is taken or not, if someone is working on it or not, and you can contribute there very simply by taking a bug and trying to fix it. Bugs are at very different levels: some bugs are very simple, some bugs are very advanced. So it's something which might be good at many stages of your journey with Python. Bugs are also fixed at sprints. The sprints will be on Saturday and Sunday at EuroPython, and at PyCon there are sprints as well. So that's also a very nice occasion to meet core developers and to get more involved.

And in general, not only in Python but for contributing to open source projects, there is the Open Source Guide, which can answer many questions. And PySlack, which I mentioned before. And PyData, which I also mentioned before. You can think about PyData as an attendee.
But you can also think about PyData as a speaker. It's always good if you have some topic which you would like to share with the community. Or even if you want to challenge yourself and you do not have a topic yet, but you would like to find one, prepare the talk, make some effort there, and share it with the community. Then PyData is a good place: just contact the local organizers and say, "Hey, I have this talk," or, "I would like to be a speaker." If you have given some previous talk, then it's always good to share that as well. If not, it's still good to reach out.

And in terms of workshops, there are the Django Girls workshops. They did a really, really great job. Because Django Girls prepared not only the tutorial for Django workshops for beginners, but also the whole setup: how to prepare the workshop, what you need to focus on, how to speak with a venue. So it's a very, very detail-oriented resource which you can use to prepare Django Girls workshops. They did a really great job.

And something which I was not able to fit into any of the other slides: the idea of open education in general, open education at the academic level. The idea is basically this. You are writing your paper. It might be from STEAM, but it might also be from sociology, or from different perspectives as well. And you would like to share it openly. You would like to make it open, but at the same time you want to do it properly. You want to make sure that the paper will be reviewed, that other experts will take care of this paper, will see if it's something which is viable, and that you will get some feedback as well. It's a rather new project, but I keep my fingers crossed very strongly for this one.

And basically, now I would like to start a discussion. I'm actually very curious about your stories. Do we have a mic? Oh.
So basically, if you don't mind, it would be great if you would share with us how you started to learn Python. How did you start to learn data science? And I'm also very happy to answer questions if there are some.

Yeah, you mentioned the Django Girls tutorial, which I think is really great, because as you say, it really starts from scratch. If you don't know it, it explains what a text editor is and how to use Git, and all these things that most people consider a prerequisite to becoming a programmer, so it's really good for non-technical people to start with Django and Python. But the problem for me, well, it's not really a problem: it's focused on Django, and I've got a lot of people asking me, "OK, I want to do data science. What resources can you advise to start in data science from scratch?" So the question is, are you aware of any effort to replicate this kind of Django Girls tutorial for more data science things? Do you know if anything like that exists?

That's a really good question. So I haven't found anything like that. So basically, I just would like to be on this. No? Yeah, so unfortunately, I haven't found anything like that for data science. I know that a group in Krakow, in Poland, a group from Geek Girls Carrots, is working on some data science tutorials for complete beginners. That's a different group than Django Girls. But I know that they are working on that. It's not released yet. But in terms of Django Girls, it might actually be really good to speak with them and see if there is some way you can combine all the knowledge which they have in terms of organizing workshops and working with complete beginners. And I'm sure that they are very open to that, because it's open source. So basically, that should be possible.
And then, for example, take the information from Data Science Masters and combine something together. So unfortunately, I haven't found that kind of workshop yet. But I'm sure that will happen. And if you would like to speak about that afterwards, I will be more than happy to.

So I actually have an answer for you. You should look at Software Carpentry. It's a bit hard to organize an official Software Carpentry event; you have to go through their training. But all their materials are open, so you can just use them or adapt them.

Yeah, thank you. That's helpful. Now not a question, but the audience: would anyone like to share their story, like how did you start with data science? I'm really curious. I'm serious.

So I have a question for you. One of your slides showed all the Creative Commons licenses. And there are big discussions about some of the restrictions that people decide to apply to their content, especially the non-commercial one. What's your opinion about that?

In general, I want people to have a choice. So if they want to choose that their work may be used only for non-commercial purposes, so that no one can use this work to earn money on it in a direct or indirect way, then I'm fine with that. The things which I create, you can usually use commercially. That's my choice, but I really believe that other people should be able to make this choice for themselves. I'm not sure if that answers your question.

Oh, yeah, that answers it.

Looks like you wanted to say something. No.

I have another question for you. You mentioned Jupyter as a suggestion for people to use. Did you ever try JupyterLab? It's from the same people who built Jupyter notebooks; they are building it as their next IDE. Do you have any experience with JupyterLab?

Not yet, but I would like to learn something more. So if you can tell us something more about it, that would be great.
So for those who don't know, the same team that was building Jupyter notebooks decided to build the next IDE on top of all the technology that they were using for Jupyter notebooks. In terms of technology, you have a kernel, and you use ZeroMQ, if I'm not wrong, to connect the web browser with the kernel. That can be a Python kernel, it can be an R kernel, or another language that's going to process the cells and send back the results. So they decided to use that technology to build a full IDE, and it looks very promising as far as I've seen in videos. It's still very early days, because they only reached version 1.0 a few months ago. But it looks very interesting, and they still support Jupyter notebooks. It just makes it easy for you to access all the local files and have all the information on the screen.

Yeah, I will definitely take a look at that.

JupyterLab, sorry, I don't think that anyone has written a tutorial yet. Like I said, it's early days. You can find videos online of people demonstrating it, and it's now as easy to install as any other Python library. You just need to pip install jupyterlab.

I have JupyterLab, but I've never used it. So I wonder, is there documentation for that?

There's some documentation on the internet, so you can Google it. Any other questions?

What about the MOOCs, Coursera, and edX, and things like that?

Oh, that's a good question. So if you think about the way you are learning as a learning experience, if you think not only about some courses which you have on the internet, like Coursera, or a bunch of other resources like, for example, by... OK, I think we don't need that for now. So a bunch of other resources like, for example, those created by General Assembly or some other online schools. Those are helpful, but it's very... Like, you will find a tutorial there.
So you will find, most of the time, a very academic way of thinking about some issues in data science, and very often they are lacking the information on how and why you will actually need to use it. So what I really prefer in a learning experience in general is that you have a project which you can work on. Then, yes, for this purpose you can use, for example, Coursera, and find some courses on a very particular problem. You can find that, but doing only this kind of course is not enough.

Actually, I was very happy and lucky to work with many women in Europe and in the U.S., mostly women, not only, but mostly, to help them in their transition from one field, for example from working in finance, into data science or into programming in general. And what was working very well for an adult who already has some kind of job was having a project and having people around who support them. And then resources: they are necessary, but they are not the most important thing. I took this a little bit around, but I also have a list of commercial educational materials, so if you are interested in that, I will be happy to share it.

We still have time for one question, if anyone from the audience wants to ask.

OK, I'll not tell my story because it's a bit long, but one takeaway from my story is that the workshops you mentioned, like Django Girls and so on, are really great if you can find one. It's also fun to organize them, but it can be a bit daunting at first. It takes three kinds of people for these workshops: a teacher or mentor, the people who learn, and the organizer. And the organizer is kind of overlooked, but if you want to learn something, it's usually not that hard to find the mentor.
If you know where to look, if you go for example to a Python meetup, you will find experts who will be very happy to actually teach someone. So if you want to learn something, I would recommend going to a Python meetup, fishing around for someone who wants to teach, and then organizing a workshop. It doesn't have to be big; you can have three people and one mentor. And if that goes well, you can then scale up and go to a full Django Girls or some other big workshop.

That's a really good tip. And actually, at the beginning, you can think about it as a hackathon or just a hack night. Hack night, that's a good name. So it's about finding this mentor, organizing three or four people, and setting up the place. For example, some coffee shop nearby where you will have a table which fits four people. And that's it. Just announce when it will be, in that kind of group, by email or a messenger or anything like that, and that should work. Yeah, thank you for this comment.

Thank you for all the questions from the audience, and I want to thank Kamila again. So a round of applause for her. Thank you.