Thank you so much, Mark, for your introduction. Hi everyone, my name is Jing Bo-Wen. The co-author of my talk is my supervisor, Dr. Ben Evans, Associate Director of NCI. Today I'm going to share with you some of the recent data training that NCI has been offering to our users.

NCI manages more than 10 petabytes of research data, and those data are tuned for computationally intensive methods and data analysis. Some of the data are also served through our data services. Our data collections include satellite images, climate models, weather simulations, observations, astronomy, geophysics, genomics and bioinformatics data, and so on. We have also developed an in-house high-performance data service called GSKY, which allows users to extract information online very quickly. Currently we are mostly serving observation data, but we hope to release some climate model data as well.

With the data and tools available, the traditional way to interact with HPC is to submit a job through the command line and then wait for the job queuing system to finish and return the result. But nowadays in data science, people are more familiar with the Python, R and Julia world, so how do they connect their workflow with HPC? Our solution is Pangeo. We have deployed the Pangeo open-source ecosystem, which empowers users to work interactively with HPC resources from a web browser on their local computer; I'll come back to what such a notebook session looks like with a minimal sketch at the end of this part. On the left-hand side is a tutorial that guides users step by step through setting up this research environment. The data training that we offer is pretty NCI specific, within our own context: we need to let users know that we have a whole range of datasets, we have data services, and we recently built a high-performance data analysis platform.

To give you an idea of what users can do, I picked four different animations from my example pool. The first one, on the left-hand side, is Cyclone Debbie in 2017. You can see the centre of the cyclone moving very quickly inland; this is data from Himawari-8, a meteorological observation satellite. The middle one is the global temperature anomaly over the past 165 years. This animation shows global warming: you can see the circle expanding very quickly over the last 30 years, which reflects the rapid global warming of recent times. On the right-hand side is a bushfire burn scar that we captured from Sentinel, a satellite imaging sensor; we serve this data through GSKY. The top one is the correlation of rainfall between model data and observations; the correlation basically tells you how good the model is at predicting the real data.

We teach people how to search our data collections through the NCI data catalogue, how to access the data, and how to extract the data programmatically for their own research. We also teach people how the data services work and how to plug a data service URL into an independent application such as QGIS, ArcGIS, Panoply and so on. These are introductory-level trainings; they don't require any prior knowledge for users to get started. We really want to promote this because the data services are free and open to everyone, and we want as many people as possible to use them. They don't even have to be NCI users.
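Below is a minimal sketch of the kind of interactive notebook session I have in mind, assuming a Pangeo-style environment with dask-jobqueue and xarray installed on a PBS-based cluster. The queue name, resource requests, file path and variable name are illustrative placeholders rather than NCI-specific values.

```python
# Minimal Pangeo-style sketch: start a Dask cluster on a PBS-based HPC system
# from a Jupyter notebook, then analyse a large dataset lazily with xarray.
# Queue name, resources, data path and variable name are placeholders.
from dask_jobqueue import PBSCluster
from dask.distributed import Client
import xarray as xr

# Ask the PBS scheduler for worker resources (values are illustrative).
cluster = PBSCluster(cores=8, memory="32GB", walltime="01:00:00", queue="normal")
cluster.scale(jobs=2)        # request two PBS jobs' worth of Dask workers
client = Client(cluster)     # connect this notebook to the cluster

# Open many NetCDF files as one lazy, chunked dataset.
ds = xr.open_mfdataset("/path/to/precip_*.nc", combine="by_coords",
                       chunks={"time": 100})

# Nothing is loaded until .compute(); the work is spread across the workers.
annual_mean = ds["precip"].groupby("time.year").mean(dim="time").compute()
print(annual_mean)
```

The point of the sketch is the design: the notebook stays light and interactive while the queuing system still owns the heavy compute, so users keep their familiar Python workflow without giving up the cluster.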
The next level of training courses is for users working with large datasets. They will have questions like: how do you manage a lot of file input and output, how do you optimise parallel I/O, how do you work with data that doesn't fit in memory, how do you improve job efficiency, and how do you run parallel jobs on the cluster efficiently and optimise your programs? Those are the advanced-level topics that we teach our users.

This year our focus has been fully on developing training materials as presentations, wiki spaces and web pages. Jupyter notebooks are another focus, and we also develop user guides. This is because we launched a new machine last year and we are building a new cloud this year as well. For example, we released the new virtual desktop infrastructure just this Monday, so we have to keep our documentation and user guides up to date at the same time as we release new infrastructure and services.

When I designed this training curriculum, I found a few useful channels. One thing I forgot to mention here is a survey that we did with our users. Apart from that, direct interaction with our stakeholders is the best way to get first-hand information. I also found helpdesk tickets to be a very useful information resource: I use text mining to find the most frequently asked questions (a simplified sketch of this idea follows below), and from the questions users ask I can figure out the technical caveats or knowledge gaps that we could address in our training sessions. Another driver for training material development is that we keep releasing new datasets and new data service functionality, and we need to let users know about those new things. Being an observer at existing national and international workshops, I was motivated a lot by their ideas and styles. One community, Software Carpentry, I found to be such a supportive community, and I got a lot of useful information from there.

Just to share with you what our community space looks like: on the right-hand side is a snapshot for the climate model intercomparison community, CMIP for short. We post information about data uploads and downloads, and also errata, which are errors in data that we sometimes take offline before republishing a corrected version. Users can come here and check the latest information. On the left-hand side is a web page where we offer a data webinar series that runs on a bi-weekly basis. We use this as a channel to communicate with users by building up introductory-level knowledge: an overview of the data collections, how to log on to the virtual desktop infrastructure, how to build a Python environment, what GSKY is and how to use GSKY for your research. You can see on the right-hand side that the number of registrations has been increasing over time, and I got feedback from attendees that this is an economical option: they don't have to make a big commitment of either time or money, and they can keep their own pace, slowly picking up the concepts by attending the webinars.

In the next few slides, I'm going to share with you a few onsite events which happened recently. The first one is the half-day training that we offered alongside the AMOS conference. AMOS is the Australian Meteorological and Oceanographic Society, and this is one of the major societies we have been serving.
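As mentioned above, here is a simplified sketch of the helpdesk text-mining idea, assuming scikit-learn is available. The ticket texts are invented for illustration, and the real process also involves cleaning and manual grouping, which I have left out.

```python
# Simplified sketch: surface frequent terms from helpdesk ticket text.
# The tickets below are made-up examples, not real user data.
from sklearn.feature_extraction.text import CountVectorizer

tickets = [
    "How do I load a conda environment on the login node?",
    "My PBS job ran out of walltime, how do I estimate memory and walltime?",
    "Cannot open a NetCDF file from the data service in my notebook",
    "How do I request more compute allocation for my project?",
]

# Count single words and two-word phrases, ignoring common English stop words.
vectorizer = CountVectorizer(ngram_range=(1, 2), stop_words="english")
counts = vectorizer.fit_transform(tickets)

# Sum the counts over all tickets and list the most frequent terms.
totals = counts.sum(axis=0).A1
terms = vectorizer.get_feature_names_out()
for term, n in sorted(zip(terms, totals), key=lambda t: t[1], reverse=True)[:10]:
    print(term, n)
```

Ranked terms like these point at the recurring questions and the knowledge gaps we then try to address in the training sessions.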
A lot of our users go to the AMOS conference, so we go there too and offer training to give them the latest information on the data, tools and platforms and keep them updated.

The next event, which we ran last year, was a global hackathon. It called for attendees from around the world, students and researchers, to address challenges facing Indigenous communities, challenges that can be tackled using open Earth observation data. The event was part of GEO Week 2019. Teams were formed across Alaska, Germany and, of course, Australia, and they were given a maximum of 36 hours to work on the challenges. I was amazed by the end of it: everyone was short of sleep, with dark circles under their eyes, but they were so excited about what they had achieved in just 36 hours. This was really one of the most stimulating events I've ever hosted or attended.

The other major event last year was the Australian Research Supercomputing Users Forum, hosted by NCI in collaboration with Pawsey and NeSI. The idea of this symposium is to bring all the HPC and high-performance data users into the same room so that they can share technical details, make connections and showcase their great work using HPC. Of course, it was another great opportunity to offer training, so we offered training in two tracks on the same day: in the morning, introductory courses on high-performance computing and high-performance data analysis; in the afternoon, more advanced topics. Because of the nature of this user forum, more advanced users registered than introductory-level ones.

The last one was at the beginning of this year, just before COVID happened: the fourth Australian Climate and Water Summer Institute. This summer school runs every year around this time and was led by Professor Albert Van Dijk from the Fenner School at ANU. I'm always glad to see so many young researchers, students and early-career researchers from our stakeholders, such as the Bureau of Meteorology, CSIRO, ANU and other universities, and the Murray-Darling Basin Authority as well. It's a great opportunity for NCI to promote our data collections and environmental data services and to introduce HPC, and those students and early-career researchers can potentially become our users along their research path.

To recap what we have achieved through training: we found that establishing and enhancing collaboration with our data experts through training is extremely useful. Helpdesk tickets are an informative source for building training material. Webinars are a popular option that basically saves people money and time, and keep in mind that this was even before COVID-19; perhaps now they will be in even greater demand. Online tutorials are a focus this year, and we have designed a lot of tutorials for users to learn at their own pace, so please stay tuned for the new material that we will release by the end of this year. Whenever I run a training session, I am motivated to do more, because training is not just delivering things to people; we also get feedback from people and learn what to teach in the next round of training. So it really motivates the next round. Of course, there are challenges.
It's hard to satisfy everyone's needs and meet everybody's expectations, especially because our users come from different domains. One good thing, though, is that cross-disciplinary exchange becomes possible: when people in one domain see what people in another domain are doing, they can pick things up from each other. The other challenge I found is keeping material updated. For example, in my data analysis examples I use xarray and Dask, which are still very actively developed libraries, so every couple of months I have to come back, make sure the material is up to date and replace the older functions with newer ones. That is time-consuming overhead, and the more material I have, the more overhead I have. The last point I put here as a challenge is that I find it hard to promote training events, but from the earlier discussion today I think I have already picked up some good ideas, and I really feel like this is the right home for trainers to share experience and learn from each other.

Lastly, I'm going to wrap up my presentation by sharing a little of my own personal feeling about training. As a learner, I find that if I know how to do something, it seems pretty easy, but if I don't, it stops me from learning because I assume it must be very hard. In fact, there is almost always a technical solution out there. So through training, the trust and bond we establish between trainers and attendees is this: even if they can't resolve every future problem themselves, they know there are resources and willingness in the community to help. I think that's part of the goal of training, and that's also the rewarding part when I deliver training with my passion and love. I'm going to stop here. Thank you for your attention; happy to take any questions.