Good morning, everyone. Thank you for making it out. It was misty before, but now it's cleared up. Thanks, everyone. We're gonna get started here in a couple of minutes. Let's see, actually, if I could get the agenda back up on the screen, that'd be great.

Yeah, so I was just gonna walk everyone through the Wikipedia Data Design Challenge 2017. First year, first time for everybody. Give yourself a round of applause. Yeah.

All right, so yes, today, March 4th, we set up, everyone made it on time, it seems, and here I am welcoming you. We'll have our keynotes. Then we're gonna do some tutorials, where we'll sort of ramp you up on some technologies before you settle into your ideas and teams and so forth. We've got some lunch, got some team building, and then we're gonna hack. Or design; I'm not sure what the verb is there, the parlance, but I'm happy to learn.

So yeah, let me talk for a minute just about why the Wikipedia Data Design Challenge exists. Basically, Wikipedia is this huge community of tens of thousands of content creators, editors, researchers, and fact checkers, this amazing community of people who are dedicated to this task. And then we have the Wikimedia Foundation, which is dedicated to running the wikis. It's like a top-five website in the world, but we're talking about a 200-employee organization. There's only so much it can do. So we also need to have a community of people who are dedicated to helping that community of editors get recognition for all the hard work they've put into getting those facts and verifiable information into this huge repository of human knowledge.

So, with those words in mind, I want to invite our first speaker. His name is Maciej Cegłowski; hopefully I got it right. He's been a great voice for this concrete kind of community building and change in my life, and I think that he'll be a great inspiration for you as well. All right, everybody.
Thank you, Maciej.

Thank you so much, and thank you for inviting me. I'm gonna try to be high-energy Maciej, because it's Saturday morning and this is why I was brought in. But it's not easy. It's not easy.

I want to ask: how many people here were born outside the United States, like I was? I'm not surprised. And how many came to the United States when they can still remember coming to the US? All right, a smaller group. I came when I was six years old, from Poland, sort of by accident. One of my earliest memories of this whole process was going to the Safeway supermarket. To a kid from the Eastern Bloc, that was really kind of a heavy thing, to have to see this astonishing display of American abundance. It was hard for me to understand how there could be so much wealth in the world, and what particularly got me was the breakfast cereal aisle. We don't really do the concept of breakfast cereals in Poland, certainly not back then, but here was an entire aisle of food that was optimized to speak to the deepest part of my kid's soul: cartoon characters telling me to eat sugar, just a wall of it, to the end.

So I have this vivid memory of that, and a lot of other coming-to-America moments. And every weekend when my mom and I went to the supermarket, she would give me a quarter and I would go play Pac-Man. Being a socialist kid, I thought that Pac-Man was trapped in a maze and he was trying to find his friends, and you had to steer him over to his friends, who were looking for him. And so my games didn't last very long, as you can imagine. Of course, the correct way to play Pac-Man is that you consume as much as possible while running from the ghosts that relentlessly pursue you. So it's a valuable early lesson in what it means to be an American, and I'm kind of glad I absorbed it. And, looking back at it in retrospect:
It taught me that technology and ethics are not really easy to separate. They took a very sweet socialist kid and threw him into the crucible of 1981-era arcade games and the harsh lessons they teach about the world. So that technology that ran these arcade games now permeates every aspect of our lives, and it's still really annoyingly full of ethics. You can't get the ethics out of the technology and just enjoy it in peace. It keeps cropping up on you, and I'm annoyed by it, and I'm sure you are as well.

And our ethical agenda as an industry hasn't really been much different from Pac-Man. We try to consume as much as we can, as fast as we can, while running from the various ghosts that are chasing us. The closest thing that we have to a credo in the industry is the famous "move fast and break things", which, you know, kind of worked for a while. It's like when you're a teenager and you rebel against your parents all the time, and you're really socking it to them, and then there's a terrifying moment, hopefully in your teens, when you realize that your parents are just people like you, trying to make their way through an uncaring world. They're not even really that powerful, and they kind of need your help and support, and that's part of growing up. I think as an industry we've also realized that we can't just knock down every single institution and expect flowers to grow, and then a beautiful forest. Things have consequences, and we're starting to see a lot of them. We're running out of things to break, in fact.

So one reason that I think we're here is to figure out how this new institution that we've created can kind of buttress some of the old ones that are so much under threat. And then a second reason that hopefully we're here is specific to the web. It's this big question of whether the web is going to be a creative medium like what we kind of got used to during its flowering, or if it's going to become a creative medium
the way that video games and movies are, where there's a small specialist cast that can create this sort of stuff and most people's job is to consume it. You have an industry where you require years of training and you can do one tiny piece of it, and then there's a pipeline of specialists, none of whom understand the whole, but together they can create these kinds of experiences that everybody else just looks at. And there are many people that claim that no one single person can possibly understand the entire web stack, the technologies in a web page. These are usually the same people that create eight-megabyte text experiences that have 200 network requests per page. But it's possible that their view is going to win out in the end, and I'll just sound like a minimalist Luddite in my older years.

Because most people now consume something that we just call content, by looking at it on small screens that are hard to interact with, usually when you're on the subway or on the toilet. And we have this elaborate system for showing this stuff and making it load quickly, and an entire infrastructure of caches and frameworks and hacks, and ad things to fund it with. And running the entire show for us is a set of algorithms that nobody really understands, not because they're very difficult or complex, but because they're kind of a tangled hairball of stuff that, it turns out, if you throw enough data into, you can train up to do kind of astonishing things, but without a great understanding of why that happens. And they're really hungry for data.
So the role of people in this consumption world is to be data factories. Everything you do is measured. You're supposed to interact with stuff, and then that gets taken in and presented back to you in the form of a refined product that you consume. It's a weirdly colonial feeling, where you generate the raw materials and then you're supposed to buy the finished goods from the experts that sell them to you.

Here I have in my notes "anecdote about my cat", which in retrospect is a bad thing to put in notes. My cat has done a lot. Let me think of the one I have in mind in particular. I lived in New York about ten years ago, and my roommate, who was kind of the smartest person I ever met and a programming hero of mine, at one point called me over into the other room and said, "Hey, I trained your cat to do something cool." And he showed me that the cat would fetch this little milk tab that he would throw. She would bring it over to him and he would throw it, and she would bring it back a couple of times until she got bored, like cats do, and went to sleep. And he was really proud of this behavior. But over the next couple of days I noticed that the cat would come play with him when it wanted to. My friend thought that he was training her to fetch, but she had trained him to become a cat toy. He would drop whatever programming task he had to do and just immediately launch into this mindless, repetitive behavior until the cat got tired of it.

And I think of this every time I look at social media sort of stuff, where we've told ourselves that we interact with this web world and train it to do whatever we want. We train Amazon to show us interesting books. We train Facebook what our likes and dislikes are. We train all these things and they serve us. But really I think that they train us to click, engage, do whatever it is that these algorithms want, not really nefariously, but just
because that's the way the system works. When you ask a computer to become really good at making you interact with it, it takes that job seriously. And for better or worse, we've seen what happens when we have this world where everybody's a consumer and the computers are trying to get us to interact. They do bad stuff. They take us in radical directions.

So keeping that in mind, I want to talk briefly about left turns and stupid ideas, which I think are the motor of progress on the web. And certainly what we're here for today represents a very, very stupid idea, which was: anybody can edit anything and it will somehow work out. And the fact that it does somehow work out is still surprising to me, years and years after the fact.

So the web was born from this series of stupid ideas. What usually happens is that someone has one, it turns out to work against all expectations, and then smart people from the entire industry come and try to codify it and figure out why it works and build things around it. And it kind of crystallizes, until someone has another stupid idea and we repeat the cycle. Open source is a great example, where the idea that a Finnish grad student working in his spare time with a small team of ragtag misfits could dethrone Unix and bankrupt Sun Industries seemed like a stupid idea, which in hindsight turned out to be an extremely effective way of creating these tools. And what happened after it was clear that it wasn't all that dumb is particularly interesting to me. For a while, and if you're old enough you remember it, it looked like it was going to be open source versus evil Microsoft in the world. That was the battle that was shaping up. And then everything took a left turn. It turned out that open source didn't really matter as much as we thought, because the question is not really who runs the software, but who has all the data. So the battle was worth fighting, but it
turned out that we have a whole different battle, about whether the web is a centralized place or not. If you're clicking on Google, you don't really care if the thing that you're clicking on is one of their open source libraries or their most proprietary algorithm, because it's still them versus you. So there's this whole other framing that we have now, and it seems like that's going to be the battle for the next few years to come. But soon things are going to take a weird left turn that we don't anticipate, and we're going to have a whole other mess on our hands.

So I'm fascinated by this idea that dumb things that people try, and then unexpected consequences, rule us, and we never learn the lesson. We always think that we can anticipate. There are things on the back wall there that extrapolate to 2030, you know, what's the web going to be like, and what the demographics and other things are going to be like, and we do a really bad job of facing the future as the mystery that it is. So, like I said, right now it looks like this battle is between the giants of centralization versus the open web, which hopefully we represent. We saw some amazing failure modes last week in the centralized web. If anybody here runs Cloudflare in front of AWS,
I feel really bad for you this week. First Cloudflare spat out everybody's secrets into caches God knows where on the internet, and then somebody literally typoed something into S3 and brought down the entire backing store for the web. So we live in interesting times when this kind of thing can happen, but it's going to change in a way we don't predict. And, you know, I really want to lobby for you to have stupid ideas this weekend.

We had a really clear future in 1995 or so. Bill Gates wrote a book about it. We were gonna have the information superhighway, and it was gonna be through your TV, and you could buy books and programming and create your own newspaper and stuff like that. The really interesting thing to me about the information superhighway prediction was that it actually came true, but it only came true after we took a really long detour through the web and all the stuff that we loved and never imagined coming. And then 10 or 20 years later we finally settled into this boring thing where, yes, you can actually buy things through your TV now, and TV programs, but it's almost beside the point, because there was this whole other thing in the middle. So I want to ask you to help create the other things in the middle.

Wikis and Wikipedia are an inspiration to me, because I was one of the loudest voices of mockery when I first saw this, and I remember how smug everybody was about the potential of a wiki. If you remember, Encarta used to come with your computer, and that seemed like the closest we would ever have to the ultimate source of human knowledge: Microsoft graciously giving us a free encyclopedia that fit on a CD. And then, I forget the name of the project, but there was a bunch of professors who took a look at early Wikipedia and decided, "We need to build this, but with actual professors." Does anybody... what's that?
Citizendium? Yeah, and I think there was another one too, because that doesn't ring a bell. This idea came up a lot, that we need authority figures to create the authoritative encyclopedia, and all of those projects died in flames, which is also very surprising. And the stupid idea, just let anybody edit everything, turns out to be, in my mind, the greatest cultural accomplishment of the 20th century. Really, really shocking that it worked.

So what's the other stupid stuff we're not trying? We have too many smart people trying smart things. We really need the stupid stuff to happen. And when I say stupid, it's really my shorthand for this creative impulse that I think is what ties together all these unexpected twists and turns in our online history: this desire to do cool things with other people, and try it and see what happens, that results in really pleasant outcomes. And to me it's weird that in such an introverted industry, where we're all hidden inside our computers, the stuff that comes out that's most meaningful is always about connecting people together and creating ways for them to collaborate and work and surprise each other, whether it's frivolous or whether it's deeply serious.
That's where the cool stuff from the internet comes from. So I'm here to cheerlead for the creative impulse, right? But we're also here in the shadow of something a lot darker, where we thought we lived in a world that, for all its cruelties and its injustices, was on a trajectory that was gradually heading someplace benign and nice, and that feeling is sort of gone. The range of imaginable futures for us, whether as people or as creatures of the internet, has expanded to an uncomfortable degree, where we really don't know what might come even in the next couple of months. And the online tools that we loved, and some of us have helped to build, have become really effective tools of hatred and oppression. To me the scariest thing is that it's not a blind hatred or a stupid one. It's this intelligent, clever force that is full of vitality and has a lot of creative energy of its own that we're up against, and didn't expect to see. At least I never did.

And then, just to put the cherry on this Saturday morning sundae, the change is all taking place at a moment when our physical world around us is also changing, permanently, probably beyond our power to control. And we all know that if we make it to 2030, or whatever is on the back wall, the main thing we'll remember about this time is that it is when everything changed, about the climate, about the ecology, and this will be the thing that, if there is a future history, will be chapter one in that history. And yet there seems to be no way we can really connect to it, to bring it into the present moment, to pay it the attention that it needs. It seems to be happening on time scales and physical scales that are just too big for us to cope with.

So, you know, I'm here encouraging you to try stupid and frivolous things in the face of what seem to be kind of catastrophes in society and in the ecology, because the one nice and reassuring thing about the future is that it never turns out quite like you see it. So even the dark vision that you
might paint for yourselves, we're all gonna go into and find out that we were wrong about something fundamentally important, and that things took a sharp left turn. But we have to kind of help it along and make it happen. My grandpa had a really nice saying. He was a fatalist, but he believed you had to help your fate along. You couldn't just sit there and be swept into it.

I also want to say that it's important that we not have too much faith in the power of facts and data in this fight. We can't really "well, actually" our way out of the current situation, and hope that with just enough Enlightenment values, enough data and facts, beautiful maps and visualizations, we can persuade everybody to be a better and more reasonable version of themselves. The journalism professor Jay Rosen has this really beautiful point that he makes, that if you're trying to convince somebody with facts and data, the first prerequisite is some sort of trust, and without that trust, no amount of persuasion, no amount of factual argument will help. People harden in their positions. One of our tragedies as people is that we're just basically savanna apes that rose way above our station. We got to laptops and internet, and that's amazing, but we are fundamentally who we are, and we have to face the fact that we're a social species.
We have to heal these rifts between us before we can hope that the technology will solve any of these problems that we basically create for ourselves. I know this is a touchy-feely thing to say, but I'm also here to tell you that this is a fight. We need to win; we need to treat it as a fight. We're up against people who take everything that we've worked for for granted and want to bring it crashing down. It's only among people that grow up in Western societies, where everybody is free of terrible childhood diseases, that you see an anti-vaccination movement, and it's only in modern technological society that you see these really anti-Enlightenment forces, that take everything for granted, making such headway. It's just like the only people we know who are libertarians are probably rich kids who grew up in a stable and war-free environment. So you take for granted the thing that is all around you all the time. But we have to not take it for granted, and fight for it. And it's okay to try to understand alternate perspectives, but you understand them so that you can beat them into dust and then fire that dust into space. I say the part about firing it into space to try to enlist Elon Musk and his ilk, so that they're behind us. So we really need to have this clear goal: that we want to have a sunny, happy community, but a community that recognizes that we're in a fight and that we have a big role to play in it.

One final thing I want to say is specific to working with nice people in government.
There is stuff happening at the Library of Congress that you may not be aware of. They were under the rule of a kind of benign but very technophobic person for 30 years, and recently got a new Librarian of Congress, and they have gone crazy with APIs. They have put up amazing data. All of it is on GitHub, all of it is... The way things work at an institution like the Library of Congress is that they measure how much use it gets, and then they can put that in their report. They're like, "We built this API, and look at all the people using it," and that actually helps get it funded. So anything you can do to connect to these amazing new sources of information, and a fairly small team there that is really excited about finally being able to use the internet for what it's meant to do, is something I hope that you'll explore when you think about data visualization. Just as a small example, the National Institute for the Humanities has, I think, 15 petabytes of newspaper scans, representing just 10% of American newspapers in the 19th century, but that's all online. And they're even willing to send it to you if you send them a truck full of hard disks. So there's amazing stuff happening. There are kind of beleaguered islands of Enlightenment values within government, and I really hope that we can help bolster that and figure out how to work constructively with it, with all the tools and cool ideas that we have represented in this room. Thank you for letting me blather at you as the coffee kicks in, and thank you so much for inviting me today.

Thank you, Maciej. Give him a round of applause. There we go. Very good. Yeah, you should follow Maciej at @pinboard on Twitter. He's got lots of good opinions, not just what can fit in 15 minutes here. Okay, so, for our next speaker, I'm bringing up Dario Taraborelli. It's a mouthful, I'm sorry, and that's coming from me.
Okay, so basically he is Wikimedia's head of research, and... there's two wonderful girls over there, setting a very good example for our future. Okay. So I'm gonna go ahead and let Dario take Maciej's energy and revolutionary nature. Oh, yeah. Very nice. No, but I'll give you one later, Maciej. Okay. All right, and he's gonna tell you about all of the untapped potential of Wikipedia's data specifically. All right, everyone, give it up for Dario.

Fantastic. Good morning, everyone. I'm really excited to be here and talk about stupid ideas combined with open data, and what that can give us to help support our project. So I run the research team at the Wikimedia Foundation. I'm gonna say a few words about what we're doing, and then give you a sense of the open data sets that we released, to try and inspire you about building things with this data.

So Wikimedia Research is a team of research scientists, UX researchers, and software developers trying to leverage data for social good. Basically, we have this very large community of readers and a smaller community of contributors, and we're trying to figure out ways in which we can use all these methods and data to grow and nurture this project, which is really a big pillar of the open web. What does it mean in practice?
We do quite a lot of research and development at Wikimedia Research. We mostly use a set of research and modeling methods, combined with open data, to try and address some of the big fundamental problems that our communities are facing. These problems include, for example, identifying and removing biases in Wikipedia, filling gaps in content coverage across the whole of Wikipedia, and making the overall experience of being a good citizen contributing to a body of open knowledge something enjoyable and inclusive and as diverse as possible. So these are some of the problems that we're facing at the moment, and I'm gonna give you two examples of what we're doing about them.

So first: Wikipedia is really hard if you're a newbie. This is my little baby boy, and he doesn't edit Wikipedia. Oh, he's not actually my little baby boy. It's just a random picture I found on the internet. But I just wanted to give you a sense of how hard it is for someone who is not already editing Wikipedia to get started on this project. We know that if you're not a vandal, meaning if you're a good human being trying to edit Wikipedia in good faith, the chances that you get reverted on your very first edit grew by two to three times compared to when you joined the project in the early years. And similarly, your probability of sticking around for more than two months has declined dramatically over the years. This is really concerning, because we need to have fresh blood and continue to grow a population of contributors. Why is this happening? Does anybody have a guess about why we're experiencing this problem? Facebook? Part of the answer.
Yes. But one of the other parts of the answer, which we studied quite a lot, is the impact of large-scale quality control systems. This is the growth of the active editor population over time. Wikipedia went through a phase of exponential growth in the years between 2004 and 2006, and then what happened is that, at a given point in time, the firehose of edits coming in forced the community to think of more effective ways of dealing with quality control. What they did is they came out with some brilliant machine-gun solutions that would operate at a large scale to basically deal with all this incoming volume of edits. And this is what happened: quality control worked pretty well, but the size of the editor population started declining linearly over time. So basically, quality control gone bananas started really causing the greatest decline in the active editor population for our projects.

So we thought about this, and what can we really do about this thing? And we thought, well, what if we build an AI to try and socialize and rescue good human beings who are trying to do their best to contribute to open knowledge, instead of them being caught by the firing squad of quality control systems? So that's one of the things that we built. We trained a set of machine learning algorithms to detect whether an edit is made in good faith or not. This is data that's been labeled by existing Wikipedians. And it's an open API, so you can hit this endpoint. You pass a revision ID; a revision ID is basically a unique identifier of a diff. For example, this is an edit that adds to an article a sentence such as "llamas grow on trees", and the API will allow you to answer questions such as: Was this edit made in good faith? Is the edit damaging? Et cetera.
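[Editor's sketch, not part of the talk.] The request described above (pass a revision ID, get back good-faith and damaging judgments) might look roughly like the following in Python. This is a minimal sketch under stated assumptions: the base URL, the `models` and `revids` parameter names, and the shape of the JSON response are all illustrative guesses rather than the documented API, and the sample response is made up. No network call is made.

```python
from urllib.parse import urlencode

# Hypothetical base URL; substitute the real scoring endpoint.
BASE_URL = "https://scoring.example.org/scores"


def build_score_url(wiki, rev_id, models=("goodfaith", "damaging")):
    """Build a request URL asking for the given models' scores for one revision."""
    query = urlencode({"models": "|".join(models), "revids": rev_id})
    return f"{BASE_URL}/{wiki}/?{query}"


def summarize(response, wiki, rev_id):
    """Reduce an (assumed) response payload to {model name: predicted label}."""
    scores = response[wiki]["scores"][str(rev_id)]
    return {model: result["score"]["prediction"] for model, result in scores.items()}


# Made-up payload in the assumed response shape, for an edit like
# "llamas grow on trees".
SAMPLE_RESPONSE = {
    "enwiki": {
        "scores": {
            "12345": {
                "goodfaith": {"score": {"prediction": False}},
                "damaging": {"score": {"prediction": True}},
            }
        }
    }
}

if __name__ == "__main__":
    print(build_score_url("enwiki", 12345))
    print(summarize(SAMPLE_RESPONSE, "enwiki", 12345))
```

Keeping URL construction and response parsing as separate functions means the parsing logic can be checked against saved payloads before pointing it at a live service.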
And this happens in real time for over 20 language editions of Wikipedia, using open data and open source algorithms.

The second example of a problem that we're working on is holes. You may think that Wikipedia is complete if you speak English or German or Spanish or one of these major languages. But if you're not lucky enough to speak one of these major languages, Wikipedia has some tremendous gaps, and I want to show you a few examples of this. So this is a map of all the geotagged articles that exist across all Wikipedia languages. You see there's high density, for obvious reasons, in the Western world. Those are areas of the planet for which we have good coverage, reliable sources, et cetera. But this is what happens if you filter this map as a function of the languages in which we have content. This is the map for English Wikipedia: again, as you may expect, a high density of content in the US and in parts of Europe. If you look at the world from the angle of Spanish Wikipedia, this is the content you get access to. So there are massive parts of the universe where Spanish Wikipedia has absolutely no content. And things get worse and worse if you look at other languages, such as Portuguese.
This is the state of the world's knowledge you can access if you speak only Portuguese. And this is what happens if you speak Arabic. And French Wikipedia has 20,000 individual articles on asteroids, but if you're a speaker of Hausa, a language spoken by about 40 million people in Central Africa, there isn't a single article about the universe. Just to give you a sense of the gaps that we're talking about. We're looking at Wikipedia from a privileged Western angle, like the one that Maciej was talking about. There's a massive need for data science and for solutions to try and grow and nurture this project and help us fix these tremendous gaps in open knowledge.

So one small thing that we did at the Wikimedia Foundation is to try and think about recommender systems. We built a system that works for any pair of languages in Wikipedia; currently we have 295 languages that we support in our projects. You specify a source language and a target language, and you'll be able to identify those articles that exist in the source language but are missing in the target language. This is an example of the English-to-Korean pair, seeded with the topic of cheese, and I discovered this morning that Korean Wikipedia doesn't have an article on ancient goat cheese. How crazy is that? So you can start using that, and go and pitch it to a community of bilingual contributors, and try and grow Wikipedia's content across the languages. And this operates in any possible direction.
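[Editor's sketch, not part of the talk.] The core of the gap-finding idea described above (articles that exist in a source language but are missing in a target language) is a set difference over cross-language links. The sketch below assumes a hypothetical in-memory table mapping a concept ID to the article title in each language where it exists; the IDs, titles, and function name are illustrative, and the real recommender presumably also ranks the results, which is not shown here.

```python
# Hypothetical cross-language link table: concept ID -> {language code: title}.
# A language code is absent when that edition has no article on the concept.
SITELINKS = {
    "concept-1": {"en": "Cheese", "ko": "치즈"},
    "concept-2": {"en": "Goat cheese"},
    "concept-3": {"ko": "김치"},
}


def missing_in_target(sitelinks, source, target):
    """Return source-language titles of concepts that lack a target-language article."""
    return sorted(
        titles[source]
        for titles in sitelinks.values()
        if source in titles and target not in titles
    )


if __name__ == "__main__":
    # English articles that Korean Wikipedia is missing, and the reverse direction.
    print(missing_in_target(SITELINKS, "en", "ko"))
    print(missing_in_target(SITELINKS, "ko", "en"))
```

Because the function takes the source and target as arguments, the same call works in either direction, which matches the point that the recommendations are not limited to translating out of English.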
So don't just think you can translate content from English to Korean; you can also identify what exists in Korean and doesn't exist in English, and so on. So these are just two examples of how internally we're using data to help grow and nurture and diversify the world's largest encyclopedia. And again, we're really small. I like to give my mother this statistic: if you look at the ratio between the researchers that we have at the Foundation and the consumers of content, the ratio is about one to a hundred million, considering the number of people who consume Wikipedia on a monthly basis. So there's work to be done here, and we cannot do this by ourselves.

So what I thought I would do in the time that remains is to give you a sense of the open data that we have, and the opportunities that you have, using the power of storytelling and data visualization and data science, to help us grow the sum of human knowledge and make it more inclusive, more accurate, and more accessible. So here's a few of the data sets that we released recently that I hope you'll be excited about.

First, the talk corpus. A few weeks ago we announced a big project in collaboration with Jigsaw, where we started modeling algorithms to identify personal attacks and harassment on discussion pages on Wikipedia. Harassment and personal attacks are a big problem on the internet, and also a big problem in Wikipedia, so we're trying to put some ideas to work in that space. But in the process of doing this, we had to parse and extract all the comments ever made to Wikipedia articles, and in the process we released an open data set with all the 95 million comments ever posted on discussion pages of English Wikipedia since 2001.
That's a pretty massive corpus that is just waiting to be visualized.

Second: we receive a shit ton of traffic. The population of contributors, like we said before, is tiny, but Wikipedia is among the top 10 web properties of the planet. We don't do any kind of crazy third-party tracking. We don't sell your data to anybody else. But we do passively receive request data that we collect in our server logs. This data tells us a lot about how readers consume Wikipedia content, and one of the data sets that we released is a clickstream data set. This clickstream data set contains 25 million referrer-target pairs that tell you basically what people click on when they visit Wikipedia. This is extracted from about seven billion page requests we collected in a given month. Think about it as a marble you can drop anywhere on Wikipedia, and you can follow it through to figure out where you end up by following the most-clicked links in any given Wikipedia article. This data has never been visualized before, and I'm waiting for somebody to do that.

Similarly to this, we also released a data set that we call the navigation vectors. These are slightly more sophisticated than the clickstream data set. It's a data set resulting from training word2vec models on a corpus of four hundred million unique browsing sessions.
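[Editor's sketch, not part of the talk.] Going back to the clickstream "marble" image above: dropping a marble on an article and following the most-clicked link at each step can be sketched as below. The sketch assumes the data has already been loaded as (referrer, target, count) tuples; that column layout, the sample rows, and the function name are assumptions for illustration, not the format of the released files.

```python
# Made-up (referrer, target, click count) rows in the assumed layout.
SAMPLE_PAIRS = [
    ("Llama", "Alpaca", 50),
    ("Llama", "Andes", 30),
    ("Alpaca", "Wool", 20),
    ("Wool", "Llama", 5),
]


def follow_marble(pairs, start, max_steps=10):
    """Follow the single most-clicked outgoing link from each article.

    Stops at an article with no recorded outgoing clicks, when the path
    would revisit an article, or after max_steps hops.
    """
    # Keep only the highest-count target for every referrer.
    best = {}
    for referrer, target, count in pairs:
        if referrer not in best or count > best[referrer][1]:
            best[referrer] = (target, count)

    path = [start]
    seen = {start}
    while len(path) <= max_steps and path[-1] in best:
        next_article = best[path[-1]][0]
        if next_article in seen:
            break  # the marble has rolled into a loop
        path.append(next_article)
        seen.add(next_article)
    return path


if __name__ == "__main__":
    print(follow_marble(SAMPLE_PAIRS, "Llama"))
```

The returned path is already a small piece of visualizable structure: drawing the paths from many starting articles would show where reader attention tends to pool.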
So people open a browser and browse articles in Wikipedia. We can look at the articles that tend to occur in the same sequence within a given session and understand which articles are related to each other. And so we released this corpus that's been trained like that on four hundred million unique browsing sessions. It will tell you which articles tend to occur with each other, not based on similarity, not based on links, but based on what people are looking for when browsing Wikipedia. And then, I mentioned this project before that we worked on to provide machine learning scoring for the quality of edits. Using that system, we also released a data set that presents the predicted quality class of changes that occur every month across all articles in English, French, and Russian Wikipedia. So you take a snapshot of every single article over time, you run the corresponding revision through the system, and you get back a score for the corresponding quality class of these articles. Wikipedians have developed over the years a pretty sophisticated set of quality classes that determine whether an article is top-tier quality or just a stub, and this data set gives you predictions at a very granular level for each single article in these three language editions of Wikipedia. Finally, the Wikipedia citation corpus is a corpus that we released that contains every single scholarly article identified via a DOI or a PubMed ID, every single preprint with an arXiv identifier, and every single book identified with an ISBN that has ever been cited or uncited, because very often citations are also removed from Wikipedia, in the English-language edition of the project. So this is just about static data sets, but over the course of these two days you'll have a ton of opportunities to learn not just about the static data but also about APIs. And this is one of my favorite APIs: for those of you who know SPARQL, this is going to give you access to the contents of
Wikidata. Wikidata is our open knowledge base, and running SPARQL queries will allow you to extract a ton of incredible information from the linked data underbelly that powers Wikimedia projects. And so with that I just want to thank you. We should have some great hacking time, and if you have questions you can reach me. I'm not going to be around for the entire event, but drop me a line. I'll be happy to tell you more about what we're doing and what we're looking for. Thanks a lot.
Thank you, Dario. All right. So, uh, it's got some Twitter information there too, if you want to follow them. Uh, yeah. So now I think that we're ready to begin. Oh, I love their daddy. All right. So, uh, yeah, we've got lots of time now. Hopefully you have lots of energy, because we're about to begin our tutorials, which are going to go through specific ways you can engage with these data sets and, uh, you know, actually get started on the many design challenges that exist within the Wikimedia space. So, uh, if I could get the agenda back up here, I'll give you a quick rundown, because it's actually a two-track thing, to sort of parallelize and give you guys the most time for working on stuff. So let's scroll down a little bit here. Is this the right document, Stephen? All right. Okay, yeah, I mean, you know, uh, since we're looking at our to-do and we're airing our dirty laundry here, if somebody wants to design a logo for WDDC, right, that would be really useful, even post hoc. So, uh, yeah, let's see. So basically the idea with the tutorials is that we have sort of a basic track and an advanced track. If you, uh, are new to Wikipedia, are mostly a designer, UI person, uh, you know, beginner coder, something like that, then, uh, we have a track that'll introduce you to Wikipedia and the basics. If you're someone who already knows what Wikipedia is, then off on the side here we have a large boardroom. It says small room. It's pretty big.
You'll see. You know, that's where we'll go through the more specific, uh, advanced features. So on the basic side, uh, I'll be kicking it off: Wikipedia 101, Wikipedia basics. You've probably read Wikipedia. Wait, raise your hand if you've never read Wikipedia. None of you should be getting exercise right now. Yuvi, don't do this to me. All right, so, uh, okay, you all think you know what Wikipedia is, but there are some specific terminologies that I want to introduce, so that's what Wikipedia 101 is. Then Yuvi, the troll back there, basically, uh, is going to introduce Quarry and PAWS, which are kind of basic SQL and Python interfaces to Wikipedia's back ends. This is a great service that the Wikimedia Foundation provides, and really, I use this all the time when developing our Hatnote Wikipedia projects. Okay, then we have Wikidata. Jens is going to dial in from overseas and give you a personal introduction to the structured data backing a lot of Wikipedia, increasing amounts of Wikipedia. And then we'll go into recent changes, that is, basically a live stream of Wikipedia data. Maybe you've seen Listen to Wikipedia or Recent Changes Map or projects like this. That's sort of what Hatnote ended up being known for. So it's based on that type of live stream. Over on the advanced side, feel free to hop between these if one looks interesting, you know. We're going to try to keep them synced so you can attend one and the other. And we're going to be running all of these advanced ones twice, so if you really want to be in two hours of tutorials, you can attend all of these and then all of these. All right. Yeah, so we're going to go over Wikipedia's API. Wikipedia has a huge API, very big, and, you know, it's not going to be able to fit all into 15 minutes, but we have the creator of the majority of it with us right now.
So you're talking to the right person. And then Wikimedia Labs and tools. This is an execution environment. If you're someone who's used to writing code and deploying it, and maybe you know what OpenStack means and that type of thing, maybe Kubernetes, you know, uh, then this is a tutorial for you. Um, storing and visualizing data on wiki pages: there is actually a declarative syntax, rather new to Wikipedia, where you can just define some JSON, more or less, and it will display that data. No need to render a graph. Not that we're trying to replace any of you, but, uh, you know, there are some data sets out there that we want to basically import quickly and make into graphs. And finally, OAuth. Wikipedia and all of the wiki sites, Wikimedia as a whole, is a very large ecosystem, has a lot of users, and if you want to build tools for those users, it's best if you don't have to also build the authorization system. Wikimedia has put a great amount of work into security, and you can leverage that through the OAuth API and build logins for any tools that you build today. So this is more of, like, the code-heavy track and this is more of the data-heavy track, is the idea, and the data-heavy stuff will happen right here, so you can maybe just stay put. Otherwise, uh, I think maybe we can sort of divvy up and head into the room here. Also plenty of donuts and coffee if you feel like restocking. Any questions? Anything pressing? Everyone has Wi-Fi passwords? Everyone's on the Slack? If you have questions, Stephen's on the Slack. He's manning it. He'll, uh, answer them straight away if you don't want to speak up, or if it comes to you later. All right. Well, you all have been very patient so far and, uh, yeah, we're gonna kick into high gear now. Let me know if there's anything else I could do. All right, thank you. The Slack is hatnote.slack.com, and if you, uh, need an invite, Stephen can get that for you. Also, bathrooms are right over there on the left. You passed them on the way in.