Hi everyone, and welcome to the early to mid-career keynote for the Research Software Engineering Asia-Australia conference. I am delighted to introduce Jess Mar, and I will read the very official and very formal introduction for Jess. Jess Mar is an associate professor at the Australian Institute for Bioengineering and Nanotechnology. Jessica spent six months in the United Kingdom working for the European Bioinformatics Institute before heading to Harvard University in Boston to undertake her PhD. Later she started her own lab at Albert Einstein College of Medicine in New York, USA. She has won the Metcalf Prize in recognition of her leadership in stem cell research. Her research group focuses on the development of bioinformatics methods to understand how regulatory processes go awry in human diseases. The explosive availability of big datasets, coupled with the speed at which sequencing technologies have advanced, has created an exciting environment for the current state of computational biology research. The Mar group looks to modern tools in statistics, such as Bayesian methodologies and machine learning algorithms, to make sense of biology from big data. So that is the official introduction. The personal introduction is that I've known Jess for a very long time. I have even had to look at her code at some point. I will tell you that story when we are not recording. And that's just the R part. No, you're okay, you're okay. It was actually quite easy to understand. And I am really, really, really grateful to Jess today, and I will hand over to her to start her talk. Thank you so much, Jess.

Thank you, Roland. Thank you, everyone. It is really wonderful to be here. Thank you for that really kind introduction, Roland. It's been put to me that my official bio is a little bit stuffy, but you made it very approachable, I think. And yes, Roland and I go back a long way.
And so it's actually really quite meaningful and interesting to be here giving the mid-career keynote, but more importantly, to listen in on some of the discussions and topics that you're going to discuss in this conference. Having met Roland early on in my career, and having gone through different transitions, all in academia, I think I'm certainly at an interesting crossroads, thinking about how we create more opportunities for people in this area with this sort of capability. And also, you know, there are no easy answers. So I would love to hear your thoughts and questions and have a great discussion, independent of just me talking. So I'll leave a lot of space so that we can talk together. So hi, everyone. I just also wanted to flag that I am coming to you from the Gold Coast in Queensland, and I'm actually running my own conference here at the moment. So it's not the most ideal setup. I'm in my hotel room and I hope the internet can sustain us, so let me know if there are any problems. I gave my talk this title really because it overlaps a lot with the kind of research that we do in my group. So it's a bit of a statistical pun. But I think also that there's a little non-normality in everybody. So I hope that there are things that you can take away, and I certainly have a lot to learn from all of you. So I just want to start by shouting out some kudos: research software engineer visibility is a really meaningful concept, and I think that's probably one of the reasons why we're all here today. Funny story. Roland approached me a while ago and he said, I'm putting together this conference, would you be interested in giving a talk? And I had a look at the conference spiel.
And, Roland did not pay me to say this, but I actually had no idea what a research software engineer was. In reading the paragraph, I thought it was so well written, and I'm sure Roland and Paula, you guys put that together, but it is, I think, such a wonderful delineation of the kind of people that I've worked with, and also of someone who I think I've become in working in bioinformatics. And I think this part you've all captured really well: this trouble of defining their role and value within academia, and academia is a very well-defined hierarchical space. And I've seen over the years people who would subscribe to this label of being a research software engineer sometimes getting lost, and sometimes not getting the kind of promotability or visibility that is congruent with the skill sets and the contributions that they bring. So I think this is a fantastic idea, to get people together to talk about some of these issues. I'm a big believer that you can't be what you can't see, and in reading this paragraph my jaw just completely dropped open and I felt really seen. And I hope many of you feel this way, that this conference has given you a platform where you feel seen and recognized too, because I think that's a really important step to doing the great work that we do. Again, I'm not at my office, so I don't have my dual screens. Is the Zoom panel blocking the top of the slide? It's fine? Okay, cool, because I don't know if what I see is what you see. Anyway, it's actually blocking it for me, so I'm going to move it a bit. Okay. So just a highlight: as Roland mentioned, I am a research academic at the University of Queensland. I run a research group that largely does bioinformatics research.
I saw in the chat that there are some fellow bioinformaticians out there, and I certainly recognize some names and faces, so it's really great to see you. For those of you who aren't aware, bioinformatics is a discipline that combines statistics and computer programming with the goal of solving complex questions in biology. And one of the other things that I think makes bioinformatics such an attractive capability is that you learn to work with big data sets. Data science in the last few years has become a very strong buzzword, and it can mean a lot of different things to different groups of people. But certainly, I think the real thinking and engagement with the messiness of big data, and trying to use statistics, but also using programming because the data is so large, to arrive at actionable conclusions so we can move forward in science, I think that's really capturing what we do in bioinformatics. So, in my career, I've been working in research for quite a while now, but the kind of contributions I've made have been in different areas of biology: stem cells, aging, cancer and neurological diseases. And as Roland alluded to, I used to love coding. I started my career as an undergraduate in statistics and mathematics, and I used to really love the fact that you could just write lots and lots of R code and get insights into data in ways that other people really couldn't. Having said that, I am a group leader, I'm an associate professor at a university, and I'm based at a research institute. So, full disclosure, I have a fixed-term contract, and in three years' time I'll be looking for another job. And what that really means is that I now love mentoring people who love coding, and sadly, the amount of code that I write these days is very small and for embarrassing applications like exam questions, and not for research.
But that's okay. I mean, I really love working with the people that work in my group. It's been really exciting and challenging, but also stimulating, to try to help them and get them set up in their careers. And I think the point that was alluded to in the discussion about creating sustainable jobs is a really, really important one, and one that I think we need a lot of input and action on quickly. So, just to go back to some of the work that I do. The title of my talk is really inspired by the kind of research that my research group has fundamentally been focusing on for the last 10 years. And it's essentially stepping back from this question of assuming things follow a certain way. Those of you who know data will know that it takes about five seconds before we throw that assumption out the window, but really I've effectively made a career out of asking what's not normal, you know, what if it's not normal? I'm talking about the statistical distribution, the normal distribution. And so just to remind you, I'm sure you're all familiar with this bell-curve shape. But in statistics there are actually other kinds of distributions that we don't often talk about, and these have the same sorts of mathematical definitions and statistical properties that we all know and love the normal distribution to have. But as you can see by the diversity of shapes, there are different ways to model data, and there are different things we can get out of these distributions. Some are really easy, like these bimodal distributions, where we can get different sorts of sub-clustering. But others, like the log-normal and the Pareto, are really good for modeling asymmetric data. So, in bioinformatics, all the way back in 2011, this was really the first paper that set my group apart and ahead.
The idea was that, instead of looking at the average expression, which is what most people do when they look at gene expression data, we think about using the variance. So while variance is a property of the distribution, it means that instead of looking at the center, the location, we're now looking at the width of a distribution, and proposing to use this as a regulatory parameter to understand how diseases work. And this was a proof-of-principle kind of question, working with actually some of Roland's former colleagues, and we found some really interesting things, and from then on we started to look at this question more closely: what happens when we don't assume a normal distribution of the data? And I've had various students who have taken this idea further. I had a wonderful MD-PhD student, Dr Daniel Pique, who was looking at bimodal gene expression in breast cancer patients and found some really interesting subgroupings there that helped us identify some new regulators. I've got a current student, Malindrie Dharmaratne, who's about to finish her PhD, and she's been looking at the prevalence of different sorts of shapes in single-cell RNA sequencing data, and also what that means. This is also quite a big departure, because we don't tend to think about modeling gene-by-gene individual distributions, and she's finding some interesting things that really challenge the assumptions that we all make. I've been working with a wonderful student, Ebony Watson, who's just had a paper published in Briefings in Bioinformatics, where we're trying to understand what distance really means in the context of these high-dimensional maps, where we're trying to understand data that has tens of thousands of axes, and what we can learn from that. Because again, there are lots of assumptions at play there, and we're finding that, depending on the data structure, there are different conclusions that we can make. But I'm really here to talk more about how we empower research software engineering communities.
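As a small illustration of the point about looking at the width of a distribution rather than its center, here is a minimal Python sketch. It is not code from the talk or from the group's papers; the data are simulated, and the names `control`, `disease` and `summarize` are hypothetical. It shows one gene whose average expression is nearly identical in two groups while its variance differs substantially, which a mean-only comparison would miss.

```python
import random
import statistics


def summarize(values):
    """Return (mean, sample variance) for one gene's expression values."""
    return statistics.mean(values), statistics.variance(values)


# Simulated (hypothetical) expression values for a single gene in two groups.
# Both groups share the same center (10.0) but differ in spread.
rng = random.Random(0)
control = [rng.gauss(10.0, 1.0) for _ in range(500)]
disease = [rng.gauss(10.0, 3.0) for _ in range(500)]

control_mean, control_var = summarize(control)
disease_mean, disease_var = summarize(disease)

# A comparison of means sees almost nothing; the variance ratio does.
print(f"difference in means: {disease_mean - control_mean:.2f}")
print(f"variance ratio:      {disease_var / control_var:.2f}")
```

In this simulated setup the difference in means stays close to zero while the variance ratio is large, so the signal lives entirely in the spread.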
And so I thought it might be insightful to share some of my origin story, in terms of how I came down this path. I wanted to start out this part by saying that I really slept my way through a first-year undergraduate engineering math course. And that was because we were taught, as probably many of you were, or maybe nowadays they do it really differently, by learning Maple and MATLAB through these, you know, mini tutorials on how to code for the first time. And that's why I slept my way through, because at the end of this I did not know how to code. I didn't even know I didn't know how to code. And it wasn't until I went to WEHI in Melbourne in 2000, where I was fortunate to get a summer internship working with Jean Yang. I can't remember exactly the project, but she effectively gave me a set of box plots and said, you know, I've got this software in R that does a sort of analysis, but I've got these box plots and I need you to basically flag when there are dots, like outliers, in the box plot, or where they are, how far they go, or something like that. It was a super, super simple question; you could see the dots. And it took me a few weeks to learn R, to get it to do what I needed it to do, to get it to do it correctly, and then to complete that loop in a timely way. And I remember those few weeks were the most exciting weeks of my career, at least at that stage. And it really reflected how insignificant or insubstantial, if that's the right word, the undergraduate training had been, and it was a real contrast to actually have the chance to sit down, focus on a specific problem with coding and orientate my learning around that. And after that I really caught the research bug in a big way, so I went on to ANU for the following summer and worked with Professor Sue Wilson there. And it's also this idea of being lucky, and also being open and available for opportunities.
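The box-plot task from the WEHI internship, flagging the dots that fall beyond the whiskers and how far out they sit, can be sketched in a few lines. The original work was in R and is not reproduced here; this is a hypothetical Python sketch that assumes the conventional box-plot rule of whiskers at 1.5 times the interquartile range, and the function name `flag_outliers` is made up for illustration.

```python
import statistics


def flag_outliers(values, whisker=1.5):
    """Return (value, distance beyond the fence) for each box-plot outlier.

    Assumes the conventional rule: a point is an outlier if it lies more
    than `whisker` * IQR below the first quartile or above the third.
    """
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower_fence = q1 - whisker * iqr
    upper_fence = q3 + whisker * iqr

    flagged = []
    for value in values:
        if value < lower_fence:
            flagged.append((value, lower_fence - value))
        elif value > upper_fence:
            flagged.append((value, value - upper_fence))
    return flagged


# Made-up measurements with one obvious dot far above the box.
measurements = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
for value, distance in flag_outliers(measurements):
    print(f"outlier {value} lies {distance:.1f} beyond the whisker fence")
```

Different software computes quartiles slightly differently, so the exact fences here may not match what a given plotting function draws.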
And so while I was working at ANU, they had struck up a collaboration with a startup company, Biolateral, in Sydney, and these guys were basically doing the first set of bioinformatics courses around the idea of learning how to analyze what was, at the time, microarray data. And so they invited me to come along with their teaching team, and for a year I was really fortunate to be able to teach courses around Australia, effectively on how to use R to do basic statistical analysis of microarray data. And that was a really exciting way to engage with researchers and understand the kind of problems that they wanted to solve. But also, for me, I was still an undergraduate then, so it was a really formative sort of experience in learning how to teach. And at the time, I remember one of my colleagues who was part of Biolateral, who I was co-teaching with, and, you know, it's quite nerve-racking being an undergraduate lecturing in these courses when we had participants who were professors elsewhere. And he said, you know, you really just need to know the right amount to teach something. If you know too little, obviously that's not going to work. But actually, if you know too much, sometimes that doesn't mean you're going to be a better teacher. And I think that was a really important message that I've carried through my career: just knowing stuff doesn't make you a good teacher, and there are other things that are actually really important to pay attention to in order to get that information across. After that I went for six months, as Roland mentioned in the intro, to EBI in Cambridge, England, and I worked with Alvis Brazma there on some software. I then went to Harvard to do my PhD, and I went there specifically to work with Robert Gentleman. I'm not sure people are familiar with who he is, but he is the co-creator, co-inventor of R.
And it was my dream to work with him, and being at Harvard also meant I could do some other things, so I taught R courses at the Harvard School of Public Health, and also at Cold Spring Harbor in New York, which, if you know genomics, is basically the mecca of molecular biology and genetics and genomics. So that was a really fun sort of experience. Along the way, as many dreams sort of don't come true, Robert actually left to move to Seattle, but in the process I was able to transition my PhD to John Quackenbush, who is also a huge person in terms of software development. And ultimately I then finished my PhD in 2008. Obviously there's a whole other chapter that went on afterwards, but I think for me, in developing the kind of skills that I needed and the sort of mindset to do the research I do, all of this here is really what was very formative. Sorry to keep moving this around, I hope it doesn't annoy you. For everything I've learned, I think the one conclusion I can stick to is that science needs everybody. And along the way, one thing that I've really enjoyed doing is to run hackathons. I think they're really inclusive kinds of environments that bring out a lot of the fun and creativity that is inherent in the kind of software engineering research that we do. And so I ran a stem cell hackathon a few years ago, before the pandemic. And that was really focused on trying to get more people involved, especially people who, for whatever reasons, may have shied away from programming, and we used stem cell research as an anchor to get people focused on a specific scientific topic.
And the real goal was to break down barriers to data science capabilities, to try to break down that sort of phobia people might have about programming, and just develop that approachability and accessibility so that people could see that it is actually a lot of fun to learn how to program, and to use programming, when you've got a really central, important question to solve and a team around you. So that was really, really great. Before that, when I was still based at Albert Einstein College of Medicine in New York, we also ran a hackathon there, but that was based more around aging. And again, it was themed around getting more people to have a go at programming, to come along and work with different kinds of disciplines of research. So it was a huge success. I haven't run a hackathon since, but I would definitely be open to it. I know it's become a much more popular medium since I first tried in 2017, and so it's really great to see that becoming more popular, especially in our community. Oh, sorry, there we go. Another thing I've been involved in is trying to see at what age we can foster this sort of critical thinking and data science mentality. A few years ago, I did a pilot experiment at the AIBN where I basically sat one-on-one with kids to see if I could get their curiosity up around data, to see what they thought about data-science-related questions, and also to get them to try out some preliminary skills in data science, whether it was making plots or writing code or coming up with other ways to engage with data. And as you can see from all the pictures here, it was a really great success. So I'm in the process of trying to expand that out to a high school program.
It's a little bit on hold, because we weren't actually able to debut something in '21. We did get funding to take this to the next level, but because of COVID we weren't able to take it out to schools, and the funding program wasn't happy with that, so unfortunately we weren't able to move forward. But I think maybe there's a possibility now that things are starting to come back. So I think that research, regardless of the specifics and the technicalities of the research you do, for me in my career, has been a really wonderful outlet for curiosity. And so it's really not just about the results that you generate or the papers you write or the software programs that you output. For me, and I think for everyone, it's a really powerful kind of learning experience. It really builds that sort of personal growth. It gives you networks that move your career forward, whether it's intentional or serendipitous. And it also really challenges you to grow your own self-awareness. Academia is a tough environment, and I think understanding yourself, understanding the limitations and strengths you bring, is really one way to keep moving forward. And I think regardless of what career path you go down, whatever job title you end up with, developing this outlet for curiosity and developing these sorts of properties is really important. So I put this together for undergraduates a while ago, and I thought it was kind of a cool bingo card to keep track of. You know, I sometimes in the busy day-to-day lose sight of some of these questions, but I think it's important to keep them close, just to keep checking in with yourself and understanding, you know, what keeps you up at night? What do you really enjoy the most? I think as we get older, as we become more grown up, we tend to lose sight of these things that are fun or enjoyable or inspiring.
So I think there's value in just making note of what stories resonate with you. Who do you talk to that you're really intrigued by? Is there a gap or disconnect that you really, really want to learn more about? And using those sorts of questions to guide where you're going internally, there's always value in that. So, I think the other thing that it would be irresponsible not to acknowledge is that, as I said, academia is a tough environment, and because of that, researchers need to be resilient. Roland was talking about triggers earlier, and I think there's not enough acknowledgement of that, or not enough conversation perhaps around that. But nevertheless, there are a lot of resilient people walking around. I highlight that not because I really want to harp on it, but because we often don't talk about the fact that researchers need a lot of other skills to make science go smoothly. We think about research being built on a very precise set of skills, and programming is certainly one; there's no wiggle room around calling a function or working with a data variable. But I think what we sometimes lose sight of, especially in our training of others, is things like compassion, patience, effective communication, respecting others, being empathetic, and understanding that people may be on a different path. I think if we could bring more of that front and center into our research, in many ways some things would improve. And I've got two models that I really like to lean on at various times of my career, and they both come from a remarkable artist, Mari Andrew. She's got an Instagram and she's got various books and things like that.
And I'm sure you've seen variations of this in some sort of capacity, but it's this idea that what you see is the tip of the iceberg: what you see being shared on Twitter, for people that are doing well, and we certainly celebrate and applaud those efforts. But behind that there are all these different things. And you never know, for certain people, how deep this iceberg goes, or what parts of the iceberg are really cutting into someone's life, and I think it's hard. And that certainly affects how effective we are at work. The other is this idea of resilience, of developing strength and self-awareness and all that sort of superpower that we call upon, especially to be leaders. I heard some talk about how to empower senior RSEs, and I think as we move up the career ladder, we do take on bigger fights. And I think we often focus so much on the negativity of those fights without stopping to celebrate how far we've come and the kind of fights we've actually been successful in navigating. So let's see. I know that time-wise we're sort of running short, and I don't want to slow things down too much, but I think if anything we can talk together about how we support our RSEs. I think that's what the afternoon sessions are also for, and I apologize that I can't be part of that because of the conference here, but it sounds like you're all set up to have a really great discussion. Some other things that I've been thinking about: academia is a bit like being on a roller coaster, and in some ways the chaos is a bit like being at an amusement park. And so, not to get too deep about it, but one of the questions that I've been wrestling with lately is, are we in an environment that supports and values problem solving?
You know, I think the kind of work that we do is so important for developing solutions. But sometimes that's not the same thing, right, as what's valued in our day-to-day jobs. Obviously, what is the future of academic positions, specifically research positions, in a university framework? How do we celebrate achievements? I think with other industries that are more established, there are certainly more prestigious kinds of traditions that people follow that allow us to celebrate, quite naturally, achievements from our colleagues. But when you're a new industry, it can be hard, right? And I know Roland's trying to create different prizes and things like that, and I think that's also really important, both for celebration and for visibility. One thing I wrestle with a lot is how we develop critical thinking skills, especially with different generations who have different sorts of training sets and ways to think about the problem, and I think they're all valuable. But how do we corral that to develop excellence in critical thinking? How do we build identity and credibility as a group? I feel like the attention span for so many things, I mean we saw that with data science, that can be difficult, I think to engage into finding control. And grant writing: as a research academic, this is my greatest bugbear. How do we navigate those cycles? Those of you who have been grant writing in Australia for the last two years will understand that the road has been exceptionally bumpy, and it's hard for us, but it's also hard for our ECRs. I saw that the ARC just opened their DECRA call for next year, but my understanding is the outcome of the existing call has not even been announced; those people haven't heard the outcomes.
So I think that's hard. How do we mentor our ECRs to do these sorts of things when so much is at stake, like contracts, research, all that sort of stuff? So there's a sustainability issue. Grants in Australia really come as these three-year, five-year fellowships, and when we think about that against the cycle of a PhD student, it's just really hard to create critical mass with those kinds of timelines. And there's also an equity issue, from lots of different dimensions: how do we make this grant writing experience one that is more aligned with supporting our RSEs? I think a big topic is reproducibility in research. And this last one here is how to foster mentorship. Other disciplines have got different programs and different streams for developing a mentoring community, and I think all of those have pros and cons, and it's interesting to talk through the limitations and the successes that we can mimic. I know that slide was a bit negative, but I wanted to end with this slide. As I mentioned at the beginning of my talk, while this term research software engineering is quite new to me, I feel like it's something that I've been entrenched in for my career. And it struck me recently: I've been mentoring and supervising different kinds of students, and one is a medical student who wanted to do a research project, and he had actually worked at a software engineering company for quite some time. And it's really interesting, because he is asking lots of great questions about some of the research that we're doing. And a lot of it is along the lines of: but we don't have this standardization, and we don't have this framework that's version controlled, or we don't have this. And it's really interesting, because all of those are really important points.
I've got another student who's an undergrad CS major, and he comes in and he says, well, we don't have version control, but maybe we could do it this way, or we don't have this, but maybe we could check this one and that one, and he's just brimming with ideas. And how do we navigate these kinds of problems? To me, that seems like the value that fostering this kind of community brings, because you have to demonstrate out-of-the-box thinking and be creative with the solutions that you put forward, because it's different from a formal, or I don't know what the other word is, a commercial software engineering environment, where there are much clearer safeguards or things that are put in place. And I think that's really great, because I see people getting really paralyzed and not knowing what to do. And I think having that kind of research mindset means that you see that lack of safeguards and you see the goalposts, and you figure out a way to get there. And so one thing I've also seen, which I think is really commendable, is this commitment to building new research infrastructure. You know, we need to get to point X, but if it means we have to build a boat and the oar to get us there, then we'll do it, right? And it takes a certain special individual, I think, to see that and also be willing to do that.
And obviously there's a sense of fearlessness that comes when everything is not completely defined, because I think a lot of people do get scared when they see that. It's like, well, what are the variables? How do we not know this and this and this? But when you think about it, research is never completely defined, right? You're never going to be in a situation where everything is locked down and documented and controlled and tested, so actually it's much more realistic, I think, to be in this type of setup where you have to solve these kinds of problems. And over the years, working with different kinds of research software engineering people, they've taught me a lot of things about the importance of reproducibility and everything else that has to come around that. There's this commitment to standardization; I worked with some of the people who did all of that early foundational work in bioinformatics with setting up the microarray. And there's this idea also of accessibility. We live in a time now where open access is almost a given, but I still remember lots of journals being behind paywalls, and data not being shareable, and things like that, so I think it's really important not to lose sight of that. And also longevity: how do we create resources that will stand the test of time and will be there for other people when our jobs are not there or we have moved on to other things? And I think the thing that I've seen the most in my career, and in watching other people as well, is that it is a training that requires a sort of style of apprenticeship, right? We mentor people, we bring them under our wing, we give them projects to work on and skills, and I think that very naturally lends itself to this idea of mentorship. But then how do we take that to a bigger scale and build a really robust, supportive community?
So with that, I'm going to end my talk, and sorry for going a bit over, but I have to acknowledge a lot of people in my career. I lived in the US for 16 years, and through that process had lots of really wonderful people that I worked with. While being in Australia for the last six years, I've also been really fortunate to have some really amazing researchers to work with. And hopefully I'm able to hear from you now; any thoughts or ideas or questions you might be having, I'm more than happy to answer them and address them. So, thank you very much.

Thanks, Jess.