Thank you, Stefan, and thank you all for coming. Thank you for inviting me to speak here. But before I get underway, I want to emphasise that this talk is largely not about my experience. So I am a Jupyter software developer, and I've done little bits of teaching for things like Software Carpentry. But in order to make this talk, I went to the Jupyter in Education mailing list and asked other people about their experience. And all of these wonderful people came back to me and told me about the courses that they've taught with notebooks and the advantages, the disadvantages, the things that they use, things like this. And so largely this talk is a condensation and a summary of the experience of a lot of these people who responded to my email. So I'm very grateful to them. So in the course of this talk, I'm going to be trying to answer these four questions from the perspective of somebody who's preparing to teach some sort of material and is thinking of using notebooks. So should I use notebooks? Are they the right thing for any particular course? What pitfalls are there? What should I avoid doing with notebooks? What other software tools are there that you can use along with the notebook to enhance the course? And should I host the notebooks on a server where my students can run them? Or should I encourage them to do local installations on their own machines? So I'm going to dive into the first question. Should I use notebooks? So, no. You wouldn't expect me to say that. There's a variety of technical issues: things like using notebooks in version control can be awkward. Projecting them if you've got low resolution projectors, which many places still do, can be difficult. And there are things like if some code produces a massive amount of output, then it can slow your browser down and lock it up and it can be a pain. And there are also pedagogical issues that people raised. A lot of them are to do with confusion with the unfamiliar notebook interface.
In Southampton, where I'm based, they teach a first year computing course for engineering students. And they actually avoid using the notebook in the first section of that course. They introduce it later on because of the bit on the right. They want people to learn sort of common software engineering practices like running code from the command line and writing tests for your modules and things. And these are more difficult to do with the notebook. But on the other hand, on the left there, there's people saying that the notebook can cause them problems because it introduces an extra cognitive load on their students, in that they're having to learn to programme or learn Markdown at the same time that they're trying to learn the subject matter. So there's kind of like two opposing ends of this argument. There's the people saying, you know, the bad thing is that the students have to know too much about the computers. And there's the people saying the bad thing is that the students don't have to know enough about the computers. So this is kind of a hint that maybe there's a niche in the middle that notebooks can fill. And indeed, there does seem to be. So this is Lorena Barba's CFD Python course, often called 12 Steps to Navier-Stokes. So this is a fluid dynamics series of lectures. This is Andrew Dawes, who uses notebooks to teach quantum mechanics, in particular around this Python library called QuTiP for doing quantum calculations. This is the UC Berkeley Data 8 course. So this is a data science module that they're now requiring all incoming undergraduates to take, regardless of what subject they're majoring in. This is Lex Nederbragt in Oslo, who I'm sure I've just pronounced his name wrong. He is going to start teaching a Python course for biologists this autumn, and he even gets to design a new classroom to teach it in, and he's going to use notebooks for it. This classroom, I should say, is not his classroom.
This is one of the examples that he's looking at for inspiration. And a bit of a different example. This is Mike Bright, who has given a variety of container tutorials using Docker and Kubernetes and things, using notebooks with the Bash kernel. So this is running Bash code in the notebook rather than Python code, and he delivers these tutorials at conferences like this one. In fact, I think he was at EuroPython last year doing one of these tutorials. So why do people use notebooks given all of the problems that we pointed out before? So the key value proposition, really, of a notebook interface is that you're combining computing and writing and mathematics. So you can combine explaining the steps that you have to do for something with illustrating those steps. And because increasingly just about every discipline involves some measure of computing, this is a very valuable format to describe the computing process. These are all reasons that people on the mailing list gave. I just wanted to pull out a couple of them particularly. So a couple of people said that using notebooks, as opposed to teaching sort of with pen and paper, allows their students to tackle harder problems than they would writing things down with pen and paper. In particular, somebody who's teaching chemistry said that their students can tackle problems that don't have a straightforward analytical solution. So the textbooks for this subject often limit themselves to problems where you can use algebra to make a straightforward solution, but with a notebook you can go into more complicated problems. And somebody else suggested that when they do their exercises in notebooks, as opposed to asking their students to submit plain Python code files, then because you have the Markdown notes and you have the results that the student has saved in the notebook as they were executing it, you get a better idea of their thought process than you do just from the code that they've written by itself.
And these were a couple of interesting things that I hadn't thought of. So to summarise kind of when are notebooks a good idea. So I think that notebooks are a good idea when you've got computational steps to a problem that you want to combine the explanation and the illustration of those steps. But if you're thinking about teaching with notebooks, then you should consider what it is that you want your students to learn. Like, do you want them to be more shielded from the computational stuff, the programming side of things? Do you want them to learn more of the sort of software engineering skills, like using the command line and writing tests and things? And you might decide to use notebooks for part of the course and use sort of regular coding in text files for other parts of the course. Moving on to the second question, what can go wrong with teaching in notebooks? So the big one is slow down. So it's really easy when you've got a notebook with lots of examples in there explaining all of your computational steps to go through going, and as you can see now, we do this, execute, and as you can see, we do this and this and this. And it's easy to go through it much quicker than the people watching that can take it in. So you have to, especially if you have all of the code already there in the notebook, you have to really force yourself to go through it at a measured pace so that people have a chance to pick up what you're doing with it. One way that you can do this is to leave blank bits that you have to fill in as you're talking. That's a very effective way of forcing yourself to slow down. You can also leave blanks if you're distributing these notebooks to the students, then the blanks are bits that they have to fill in. And this is sort of a standard part of teaching is that anything that involves people doing some sort of active process in their learning. 
So even if they just have to translate a formula into a bit of Python code, that sort of thing sticks in the mind much more effectively than just reading and watching passively. So you can consider doing this. And one thing that we do, for instance, in Software Carpentry is we leave blank bits for the exercises that students are expected to do. But then there is a solutions notebook in the same directory that they have. So if they're behind and need to catch up with the exercises, they can go and look at the solutions. That works for Software Carpentry because it's not assessed or anything; the students are just there to learn. So we trust them not to cheat because there's really no point in cheating. So moving on, again, extra software tools that you can use with the notebook. So this is NBGrader, which is a system for creating and using notebooks as assignments. So students have to fill it in and then there's a system for bringing it back and marking it. So this screencast here is illustrating the process of creating a notebook assignment. So you can see there's a sort of extra cell toolbar which is provided by a plug-in. And you can select these cells as being automatically graded answers or manually graded answers. And you give them IDs and you assign numbers of points for them. And it's pointing out that there's a total number of points up at the top. And then from the student's point of view, this is what they see. So there's another plug-in that you're running on the student's version of the notebooks. They get a list of assignments that they can go and download. And that assignment is now a collection of notebooks. And for each of those, the student has the option to run the automatic parts of the marking before they submit it. So they can see how many of the tests it passes or fails before they send it in to you. So in this case, the student hadn't done anything yet so when they clicked Validate, it failed.
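The validate step described above can be pictured with a small sketch. It is not NBGrader's own code: the function names are hypothetical, and it assumes that the blank answer is a stub that raises `NotImplementedError` and that the automatically graded tests are plain `assert` statements.

```python
def squares_student(n):
    """Blank answer as the student first sees it (hypothetical stub)."""
    # YOUR CODE HERE
    raise NotImplementedError()


def squares_filled_in(n):
    """The same answer after the student has filled it in."""
    return [i ** 2 for i in range(1, n + 1)]


def validate(squares):
    """Stand-in for an automatically graded test cell.

    Returns True only if every check passes; an unfilled stub counts as a fail.
    """
    try:
        assert squares(0) == []
        assert squares(1) == [1]
        assert squares(3) == [1, 4, 9]
    except (AssertionError, NotImplementedError):
        return False
    return True


print("blank answer passes:", validate(squares_student))
print("filled-in answer passes:", validate(squares_filled_in))
```

Running the same checks before and after the answer is filled in mirrors the fail-then-pass sequence in the screencast.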
Now this example student is going to go and find the necessary bit of code and fill in the code to do it. And they'll save that and go back to the overview screen. And when they validate it again, then now it says this has passed. And then the student has a button that they can press to submit this whole assignment. So all of these notebooks. And then from the teacher's or the marker's point of view, NBGrader can take care of the automatic part. So it runs the student's code. It runs the test cells. It automatically assigns marks if it passes the tests in those cells. And then there is an interface called formgrader where the marker can go through and manually adjust those marks and add marks for written answers and things. So you're not limited to just questions that have an automatic mark, which is very important because you need to check that people actually understand things as well as that their code works. And NBGrader also includes things for then collecting those marks and exporting them into different formats, and for giving students feedback, so you can make notes on their answer, tell them this is what you did wrong in this place. So it's not just about the score that they get. It's about how they can improve as well. Another tool for the same kind of thing is OkPy. Actually, both of these auto-grading systems were built at UC Berkeley. This is the one that they're using on the data science course that I showed before. This has, as you can see, a very slick web interface from the teacher's point of view here. It's not specific to notebooks, so I would guess that it doesn't have the same level of integration with the notebook interface for creating assignments and things that NBGrader does. But if you want to mark notebooks and other kinds of code submissions, then this is a very neat interface. And at the moment I think they even provide this as a hosted service for free. I don't know how long that will last.
So another group of tools here are for hosting notebooks, and we're going to discuss whether or not that's a good idea in a couple of minutes. And there are two main options that people use here. So JupyterHub is our open source DIY solution. This is Python and JavaScript software that you can install on a server, either a server that your institution already maintains or a cloud server on Rackspace or Amazon or Google or Microsoft or whatever that you have. And with JupyterHub, you can set up different sizes of things. So if your students need to solve problems that require a GPU, then you can ensure that this is running on computers that have access to the necessary GPU. And it can be integrated with different login options. So you can plug JupyterHub into your university's single sign-on system, or you can integrate it with GitHub logins using OAuth. Or if you don't want to do any of that, you can just use a standalone login and give students a new username and password to access it. The other main option is CoCalc, which was formerly SageMathCloud. This is what you do if you want somebody else to take care of it for you. The somebody else is William Stein from the University of Washington, who is doing this as a startup. CoCalc costs between $4 and $20 per student, depending on how many students you have and how long you want the course to last for. So you can choose a four month course or a full year. And it has its own integrated set of course tools, which include some really fancy things, like the instructor can even go in sort of live and remotely collaborate with a student and give them pointers as they're working on it. So before we come back to the hosted or local install question, there's a handful of other tools that people said they were using. So one sort of family is for converting to and from notebooks. So nbconvert is a standard part of Jupyter for converting notebooks to other formats. But there are other tools that you can use.
So some people like to write their notebooks in reStructuredText or in Markdown because they're a big fan of their editor. It's often people who are big into Emacs or Vi who like to do this. And there are tools that let you do that and then convert it into a notebook file. There are also tools if you want to, for instance, have a collection of notebooks and then convert that whole collection to one big PDF handbook using LaTeX. Then you can do that. There are ways to use notebooks as slideshow material. So nbconvert has an option to go to slides. There's also a plug-in called RISE, for Reveal.js IPython Slideshow Extension, which gives you very similar looking slides, but the code in those slides is still editable and runnable so you can be changing things on the fly while you're doing your slideshow. Finally, it's possible to programmatically generate notebooks. I don't have an example to point you to, but one of the people who responded said that he's randomising questions for assignments, and the notebook file format and the nbformat Python library make it quite easy to generate notebooks if that's what you want to do. Coming back to the hosted or local question, there's really a range of possibilities here. So going from the left, CoCalc is in the cloud. Somebody else deals with all of the technical details. You just give them some money and put your students' email addresses in there and everything is set up for you. There's JupyterHub, where you run the server yourself, but the experience from the student's point of view is broadly the same. They just go to a URL and do everything in the browser. If your IT department is co-operative on this sort of thing, then you may be able to get it installed on the institution's managed desktop systems so you can go and do a computer lab with the students using the university computers, or you can get the students to install it themselves on their own computers.
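As an illustration of the earlier point about generating notebooks programmatically, here is a minimal sketch of a randomised question. It is not the respondent's actual approach: the function name and the question are hypothetical, and it builds the version 4 notebook JSON by hand with the standard library so that it has no dependencies. The nbformat library offers helpers such as `nbformat.v4.new_notebook` and `nbformat.v4.new_code_cell` that produce the same structure.

```python
import json
import random


def make_question_notebook(seed):
    """Build a minimal version-4 notebook dict with one randomised question."""
    rng = random.Random(seed)  # the same seed always gives the same question
    a, b = rng.randint(2, 9), rng.randint(2, 9)
    markdown_cell = {
        "cell_type": "markdown",
        "metadata": {},
        "source": f"**Question.** Set `answer` to the product of {a} and {b}.",
    }
    code_cell = {
        "cell_type": "code",
        "execution_count": None,  # the cell has not been run
        "metadata": {},
        "outputs": [],
        "source": "answer = ...  # fill this in",
    }
    return {
        "cells": [markdown_cell, code_cell],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 4,
    }


notebook = make_question_notebook(seed=42)
# Saving this text to a file ending in .ipynb gives a notebook that can be opened.
notebook_text = json.dumps(notebook, indent=1)
print(notebook["cells"][0]["source"])
```

Seeding the generator per student (for example with a student number) would keep each student's question reproducible for marking.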
So if we simplify this to the two main possibilities of either it's done for the students or the students have to do things themselves, there are a few obvious advantages on either side. So with the hosted solution, there's nothing to install. Installation can often be a pain, so this is a big plus for a lot of people. Students can use it from their tablets. They don't have to have a laptop to set it up. Anecdotally, one of the people who does their course like this said they tell students they can bring a laptop or a tablet, and some of the students do bring a tablet. We suspect that if they told the students you need a laptop, then most of the students have probably got a laptop, but some students prefer to work on their tablet. On the other hand, installing it on students' computers is free, at least assuming that students already have computers, which I think in most western countries is probably the case, and you're not at the mercy of any part of the network connection, whether that's the network hardware on your laptop or the institutional Wi-Fi or the broadband backbone over to whatever server you're using. Any of that can go down and that can interrupt your use of a hosted service, and this is a problem that quite a few of the people who are using local installations pointed out. Anaconda, by Continuum, has made all of this stuff a lot easier to get set up, and I would say almost everybody who is asking students to install it themselves is asking them to use Anaconda, because it's made it so much simpler. There are a few people who disagree. These are a handful of perhaps less obvious things that you might want to consider, so most of the automatic assignment and grading tools only work or work best with hosted solutions.
I believe there's work under way in NBGrader to integrate support for local installations as well, but it's also trickier because of platform differences: if somebody has got a slightly newer version of NumPy and they use a method that's only in their new version and not on your grading server, then it works for them, but when they submit it, then the tests fail, so there's problems like that to be aware of. People also suggested that there are equity issues, so using a local installation may privilege people who have nicer computers to do it on. It may privilege people who have already got the technical know-how to easily install it and need less help with that. On the other hand, the advantage of having a local install is that the software tools and the materials from the course are readily available after the course is finished. I should say that William Stein, who makes CoCalc, vigorously disputes this and says that people are no more likely to keep using that on a local install than on a hosted installation. It's questionable whether, yeah, it depends on how powerful the free tier of the hosted service is and whether students are willing to continue paying for the non-free tier. They're students, so they're probably not going to pay unless they absolutely have to. My overview of this would be, are the computing skills a key part of what you want students to get from the course? If you see the computing skills as something incidental, that the students just need to use a computer to learn about this really important material, then a hosted solution probably makes everybody's life easier. If you want them to come away with those computing skills, then doing a local installation is probably worth the trouble. There are people doing local installations for classes of up to 500 students and they say it is doable. It's also possible to combine these.
Some people said that they do primarily local installations but ask students to fall back to the cloud solution if that isn't working. Some people say that they have the cloud thing as a primary, but they encourage students to install it as well. I'd like to thank once again everybody from the Jupyter education mailing list and these three foundations, the Moore Foundation, the Sloan Foundation and the Helmsley Trust, who fund our work on Jupyter. I think we have a couple of minutes still for questions. Thank you. I don't know. Oh, we've got a microphone coming.
Thanks. That's really interesting. In your introduction, most of the examples you gave were of scientific kinds of teaching, but you also mentioned using containers and Bash. How well adapted is this to that kind of work? I'd be really interested in knowing about the practicalities of it.
From a technical point of view, Jupyter has first-class support for the notion of plugging different kernels into Jupyter so you can have different languages running inside a notebook. It all came from Python, but then we generalised the idea to support other languages. The Bash kernel is actually something that I made and it was initially supposed to be just an example of how to make a kernel for Jupyter. I didn't really think it would be something that anybody was interested in using. Then it turned out that people were interested in using it. It works pretty well from the point of view of using Bash. What is tricky in all Jupyter kernels but becomes more of a problem in Bash is if a sub-process wants to do interactive input, so if you do conda install whatever, conda will say, these are the packages that I need. Do you want to continue? Yes, no. Jupyter and the kernel can't tell that a process is waiting for input, so you will see the output of that cell will say yes, no, but then there's no way to actually send the yes or no back to that process.
That's kind of a limitation, so you have to write all of the commands with the flag to say, don't prompt me please.
I have been using Python in the notebooks only, but can you mix Python and Bash in the same notebook?
Yes, but it's not part of the Jupyter kernel system. Jupyter's conception of a notebook is that each notebook just has one language, but the IPython kernel for notebooks has some of its own support for different languages, so if you start a cell with %%bash, then all of the rest of that cell will be sent to Bash to run, so you can mix languages like that.
Okay, thank you.
How difficult is it to install and manage a JupyterHub, for example on Ubuntu Server?
I would say Ubuntu Server is probably the most common target platform, just as a guess. We aim at sort of not a technical novice, but like if your system administrator is that postdoc or PhD student in the group who's good with computers, then we aim that JupyterHub should be practical for them to manage. It gets more complicated if the size of the class is large enough that you want to spread the users over multiple servers, but there are people who do that and I think there's pretty good documentation on how to do that, so it should be feasible for somebody who's sort of familiar with Python and Docker and things like this.
Hi, sorry. I've used notebooks before while teaching at university, and I didn't know about the NBGrader plug-in and that sounds quite interesting, but I found that, just using the vanilla notebook without the plug-ins, the problem with getting students to submit the notebooks is the file size. We tried to tell them to clear all the outputs, but many of them didn't, so we get very, very, very large notebooks. Is there any automated script to kind of pull them out or are there any suggestions for that in the future?
Yeah, there are some scripts that will clear the outputs of a notebook. They're primarily around version control because it's also a pain for that.
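As a rough idea of what such a script does, the sketch below clears the outputs from a notebook using only the standard library. It is not the code of any particular tool: the function names are hypothetical, and it assumes the version 4 notebook format, where code cells carry `outputs` and `execution_count` keys.

```python
import json


def clear_outputs(notebook):
    """Remove outputs and execution counts from every code cell, in place."""
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return notebook


def clear_outputs_in_file(path):
    """Rewrite the notebook file at `path` with its outputs cleared."""
    with open(path, encoding="utf-8") as f:
        notebook = json.load(f)
    clear_outputs(notebook)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(notebook, f, indent=1)
        f.write("\n")


# Small in-memory example standing in for a submitted notebook.
submitted = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": "# Exercise 1"},
        {
            "cell_type": "code",
            "execution_count": 3,
            "metadata": {},
            "outputs": [{"output_type": "stream", "name": "stdout", "text": "x" * 10000}],
            "source": "print('x' * 10000)",
        },
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 4,
}

size_before = len(json.dumps(submitted))
clear_outputs(submitted)
size_after = len(json.dumps(submitted))
print(size_before, "->", size_after)
```

A marker could run `clear_outputs_in_file` over each submission on receipt, although that also discards the saved results that show the student's working.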
I think one of them is called nbstripout, if you search for that.
Okay, thanks.
Thank you for the wonderful talk. My apologies for not replying to your email on the thread, so I want to add a couple of points here. So one thing you mentioned is slowing down, right? I do teach Python professionally. I use notebooks heavily, and one of the things I realise is that using a prepopulated notebook makes me go faster, so what I usually do is try to do live coding in the class in the notebook and then share the notebook after the class. That gives a chance for people to go through the thinking process of how I'm approaching that solution, which is usually absent when you give a notebook where everything is filled in. I think that's an interesting observation. I wanted to make it. The other thing, I think, is the duration of the course when you're using notebooks; there are some things that depend on how long the course is. A month-long course and a one or two day course are different. For example, it's fine for the installation and set-up and all that to take a long time if it's a month-long course, but if it's a single-day course, you can't afford to do all those things. Also, I realised that NBGrader works well for long duration courses, whereas it becomes very tedious for teaching one-day or two-day courses. I don't know if you want to share something on that.
I haven't used NBGrader myself. I could imagine that it's probably too much work to set up and get people familiar with for a short course. In terms of the installation, I've done Software Carpentry. That's typically a two-day workshop, two days of sort of intensive stuff. We get students set up with Anaconda and, for Windows users, things like Git Bash before starting the teaching on the morning of the first day.
I think it is practical even for short courses to get the stuff installed if that's what you think the students should be taking away from it, and for software carpentry it is. I think we're going for lunch now.