My name is Jin Bo-wen. I'm the training manager at NCI. Even though I have worked here for about eight years, the new role of training manager only started at the beginning of this year. In this talk, I would like to share the new training strategy that we established this year and some thoughts about how we measure its impact.

I think you would all agree with me that digital skills are one of the key elements underpinning the integrity of research. This has been addressed in NCI's roadmap for the next five years, and training is one of the areas where we want to focus and help the research community. At the beginning of this year, I did quite intensive consultation with internal and external stakeholders: one-on-one interviews plus hundreds of responses from a user survey about what is required of training. The message is quite consistent across all stakeholders: yes, we need training at all different levels and across different topics. We also identified a couple of gaps in this exercise.

The first one is that we know technology moves really fast, but we also observe that the community picks up technology at different paces. Some people are falling behind, and we asked ourselves why. I think it's quite obvious that some people may not even know a technology exists, and others may think, "I know it's coming, but I don't see the immediate impact on me," so there's no motivation to pick it up and learn it right away. Also, at eResearch two weeks ago, some people raised the question of how the HASS community picks up AI. I think that's a good example: they want to get on top of these technologies, but they don't know how or where to start, so that's where we can help.

The second gap we identified is the missing fundamental computational skills among students in many science domains.
If we look at the university curriculum, programming, the command line and parallel computing are not offered as a formal requirement in graduate programs. But people need these skills, so they have to learn by themselves or attend the courses that, for example, NCI, Pawsey or Intersect offer to the university sector.

The third gap is that the requirements on training are quite heterogeneous. I think that's very obvious, because we serve multiple domains: the data come in different formats, and people use different workflows and libraries. I feel that introductory-level courses can only go so far in getting people on board. If you want to help people scale up to the next level, to intermediate or even advanced, I found that specialized training examples and training workflows customized to the domain science are really helpful for people to pick things up right away.

Based on those gaps, I developed an end-to-end learning journey, with all the learning components identified in one curriculum, taking people from absolute beginner to expert. Basically, we have all these learning components built as a framework, and people can pick the parts relevant to themselves.

Now let me shift to the impact part and look at how the training helps the user community. I think the first great outcome is that people feel, "Now I can do it; before, I couldn't." They might have been suffering from difficulties or frustration: why doesn't my code work, or why does my code take ages to finish?
With training on parallel computing, for example, they can run things at scale. And of course, from my learning and teaching experience, in the end it's all about building confidence. I think training is a good way to help students and researchers build confidence in adopting new technology. Whether they realize it or not, by introducing new tools and new platforms we gradually change their workflows, and if they adopt the new tools and move to the new environment, they are basically working in a faster mode. I found that the groups who adopt the latest technology and apply it to their research domain more quickly often put themselves in a more competitive position, with a higher chance of winning grants and publishing high-quality articles in tier-one journals much faster.

And of course I'm passionate about training because I feel this is two-way communication. Through training we hear exactly what people need, we address those needs through the training material, and we build a trusting and supportive community with each other.

One way to quantify the impact is the environmental impact of job efficiency. If, after training, people's jobs run 10 times faster than they used to, that can be translated into an electricity bill, which can be translated further into how much carbon dioxide emission is generated by the research community. That is a quantified environmental impact that we achieve through the training courses.

Now I'm going to shift from the metaphor of the animal pictures to the real science by sharing another observation about impact: I see that the training community really inspires researchers to leverage work across disciplines. I will use two real examples to explain what I mean by that. The first one is signal processing. The picture here shows you an oil reservoir where high-speed fluid was injected
into the well, forcing the rock to fracture. Those fractures allow the trapped oil to come out of the reservoir. When a fracture happens, it's actually a small earthquake, so the signal looks like this: it's an impulse signal. On the right-hand side is a nerve impulse in the human body. The signal looks very, very similar to the microseismic event. What the two scenarios have in common technically is that both suffer from overwhelming background noise interfering with the signal. Consequently, the same knowledge or technology for improving the signal-to-noise ratio can be applied to both scenarios, because they work on the same type of time series data.

The second example is about images. In computer science, an image is basically represented as a matrix, with x and y as the position of the pixel and a third column that could be the color scale from 0 to 255. A color image is a combination of the RGB channels, but it's still a matrix. To scientists, though, that matrix means completely different things in different domains. This one is a brain scan in medical science, this example is a satellite image, and this example is a climate model, from a paper published in Nature last year. What these examples have in common is that the images either have low resolution or have some parts missing. Especially in this climate model, in remote areas such as the poles or the middle of the Pacific Ocean, we often suffer from sparsity of data. Using deep learning neural network technology, we can reconstruct those images at a higher resolution or backfill the missing parts, so the technologies behind them are really similar.

So the main point is that by bringing different communities together, they watch each other and see, "Oh, that technology has been used there; maybe it can be applied to my research." I think that is the motivation for, and the impact on, the research community through this training
forum.

Now the question is how we quantify this. I did a quick analysis at the beginning of the consultation process to see how many support tickets there are and what they are talking about. I used a natural language processing technique to analyze the ticket text and classify it by topic, which let me identify the most frequently asked questions. That classification gave me a pretty good idea of the daily questions reaching the user support group, and we can then address them by offering training courses or good documentation. I also mentioned system logs on job efficiency as another quantitative measurement.

But I also want to throw in another idea here: a knowledge graph. How do we measure the long-term impact of training? By introducing new tools and accelerating researchers with advanced training courses, how do we impact the whole research life cycle? Here I use nodes to represent users, organizations, grants, publications and collaborations. Each node can be identified with a persistent identifier. I built this model earlier, when I was a data collection manager at NCI and we wanted to see the impact of the data collections. On the left-hand side you can see that the graph we built is composed of the creators of the data, the data collections themselves and the publications that cite the data. We wanted to see the dynamic ecosystem of data impact between the people and the publication system. We then augmented this graph with an international database, ORCID, so that we can see the impact of the data collections further down the track. The reason I want to quote this work is to ask: can we build this kind of graph to measure the impact of an excellent training program through the whole research life cycle?

So finally, I want to wrap up by going back to my animal theme today. I really believe collaboration is a key word, so recently we
announced a partnership with Intersect, who can help us offer introductory courses to our users. I am totally overwhelmed by the response: hundreds of seats were booked out within just three days.

I feel deeply passionate about training. I see Claire is there. I remember, in my previous days working together with Claire offering training courses, we always got feedback from people, and that's the passion and love that I want to bring to the community. Of course, in translating that passion and love into practice, I think quality is one of the key words. Making sure that the tutorials we create are of high quality really makes a big difference. When we're marching together, we also want to make sure that nobody is falling behind. Especially for people who don't know where to start, I think we have a key role to play in helping each other.

With that, I'm going to finish with the general email address, training dot NCI at AU.edu.au. Please reach out if you have any comments, ideas or intention to collaborate with us. And by the way, we're also recruiting a few training officers as casual positions to help us. I'll stop there and hand back to Anastasia. Thank you.