Yeah, so I'm a postdoc researcher at the Swedish University of Agricultural Sciences, far in the north of Sweden, and as introduced, my talk is about open source tools in life science and an example where I started to work with this stuff. This is in imaging mass spectrometry. So my talk will contain quite a bit of non-IT material, just to give you an idea of where those things could meet and where there would be possibilities for further open source projects. Certainly a lot of scientists, who usually don't have much to do with IT otherwise, would benefit if people contributed their IT knowledge to make tools for advancing, for example, the life sciences.

My background is chemistry and biology. I work, as I said, at the agricultural university in Sweden. Sweden has a lot of forest, and forestry is a very important business in Sweden. We use trees to make paper; more recently we use them to make biofuels, new materials, all kinds of stuff. The idea of how this should work in the future is that we want to be able to make genetically modified trees, so that the actual physical material, the wood that we use for the products, becomes more usable for the different applications. This could be, for example, in pulp and paper: if you want to make paper from wood, you usually need a lot of chemicals to extract the cellulose, which is the part of the wood that is of interest for making paper. So you need a lot of energy and chemicals to extract those molecules. If we could make a transgenic tree and change the chemistry of the actual wood, we could also make the economics work better, in the sense that it gets cheaper and less dirty to, for example, produce paper. And this applies to basically all the different applications of raw materials. We are working on wood, but this is basically true for all kinds of other biological raw materials as well.
You can use transgenic technology to change the actual physical and chemical properties so that the material gets more usable. And this is not just future stuff which we can maybe do in 10 years. This is really being done. There are trees in plantations which are already used, let's say in Brazil or even in China, where you plant those and get better economics for pulp and paper.

Now this is just a little bit of background from a biological point of view: what was needed to make this possible. Of course you need the genome of the species you want to modify; in terms of the trees we work with, that is Populus trichocarpa. This was done in 2006, when the whole genome was published, openly accessible for everybody, so you can go into the database and look up all the gene sequences, which genes are there, what they are doing and so on. If you have the genome sequence of a species, what you would usually do then is genetic screens, mutant screens, where you randomly mutate the species and look for properties that you are interested in. You then grow plants transformed with those modified versions of the genome and select those that are of interest. This is sort of our daily business in the research institute: we are doing those mutant screens and growing the new modified trees. And then, where I come in as a chemist, I start to look at the wood, at the chemical properties of the wood: whether we achieved what we wanted, or, if we are doing it randomly, just to characterize the wood, what is different, what has happened during the genetic modification.
What we call this is phenotyping or chemotyping. It's basically looking at the very molecular structure of the wood, getting an idea of what has happened, and trying to link that to the genes that we were modifying, to figure out the function of the genes and get an idea of how we could use that in producing other raw materials.

Now chemotyping: this is where the specialty technique that I work with comes in, which is called ToF-SIMS, that's time-of-flight secondary ion mass spectrometry. This is a technique which has been used for quite some time, mostly in the semiconductor industry actually, for imaging of silicon wafers, mostly for quality assurance, but recently the technique was modified so that it can also be used on biological tissues. What the whole technique is about is that we use an ion gun and shoot with this ion gun at the solid matter, so you can see here the solid matter, here the primary particles. We shoot at the solid, in my case wood tissue or plant tissue, and you then get sputtered molecules which fly out of the actual material, and those are then, so to say, sucked up into a mass spectrometer. A mass spectrometer, you could say, is nothing else than a balance for molecules, with which you measure the weight of the molecules, and by the weight you can kind of identify which molecule that was; hence you know the chemistry of the solid that you are investigating. The way this is done is by magnetic force: you use the magnetic force to put those molecules on a certain flight path, and just those with a certain weight then fly into the detector here, and by this you know their weight.

Now the ToF-SIMS experiment, a bit more visually: you have the ion gun, you have the sample down here, and the time-of-flight analyzer, that is the mass spectrometry analyzer that we use. Here you can see some pictures of such equipment.
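As a side note on the "time of flight" in the name: in a time-of-flight analyzer, ions accelerated through an electric potential U gain the kinetic energy z·e·U, so heavier ions travel more slowly and arrive at the detector later. For a field-free drift length L, the flight time is t = L · sqrt(m / (2·z·e·U)). The following is only a minimal sketch of that relation in Python, not the instrument's software; the drift length and acceleration voltage are illustrative assumptions, and the function names are hypothetical.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge in coulombs
DALTON = 1.66053906660e-27   # unified atomic mass unit in kilograms

# Assumed, illustrative instrument parameters (not taken from the talk)
DRIFT_LENGTH_M = 2.0
ACCEL_VOLTAGE_V = 2000.0


def flight_time(mass_da, charge=1, voltage=ACCEL_VOLTAGE_V, length=DRIFT_LENGTH_M):
    """Flight time in seconds of an ion through a field-free drift tube."""
    mass_kg = mass_da * DALTON
    # Energy balance: charge * e * U = 1/2 * m * v^2
    velocity = math.sqrt(2.0 * charge * E_CHARGE * voltage / mass_kg)
    return length / velocity


def mass_from_time(t, charge=1, voltage=ACCEL_VOLTAGE_V, length=DRIFT_LENGTH_M):
    """Invert the relation: ion mass in daltons from a measured flight time."""
    mass_kg = 2.0 * charge * E_CHARGE * voltage * (t / length) ** 2
    return mass_kg / DALTON
```

Because the flight time scales with the square root of the mass, an ion four times heavier takes twice as long to arrive, which is what lets the detector sort ions by "weight" from arrival time alone.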
The lower picture would be the actual sample chamber. This is in vacuum, where you put thin slices of your sample onto the sample stage, and what you can see here, those probes, these are the ion cannons, and straight up would be the entrance to the mass spectrometer.

Now the special thing about imaging mass spectrometry is that you are not just looking at molecules in a liquid or in a solid, but you can do that spatially resolved. And the special thing about this time-of-flight secondary ion mass spectrometry is that you actually get to a very high resolution, so you can analyze your samples down to the nanometer regime, which is very special in mass spectrometry; usually you would get down maybe to micrometers. This is where it gets interesting, because you come down to a size region where you can actually look into single cell walls: you can look at how they are built, which molecules are there, how the different molecules are interlinked. And you can do that visually, so you get nice pictures where you can dig deeper. Those pictures are, so to say, hyperspectral, so you can dig down and get a lot of information about the chemical build of each point here.

Now the problem: such a machine is multi-million-dollar equipment. Not even every university has one of those; it's quite specific equipment. For me as a researcher, I'm currently a postdoc, so I go to a university, I propose my experiments, I do my experiments on the machine I get there. It's not me buying the machine, so I'm reliant on being able to make good use of the machines that are there.
Now as I said, those machines are mostly used in the semiconductor industry. All the software which is available is usually closed source and is also mostly made for industry applications, and not really for those few machines that are standing in universities and are used for life science. This is still a very new field, so it's not the same quantity as the semiconductor industry is buying. So the problem is that we don't have software, and the data that we get is in a closed format. The first thing I had to do was kind of reverse engineer the data formats and start to make toolboxes so that we can use this data and work with it like any other mass spectrometry data. Mass spectrometry otherwise is very common in universities, a very usual technique, but this specific type is not very often used, so there were no tools available.

And this is where the whole open source idea comes in. In life science we often use the language R. For bioinformatics there's a very nice repository, Bioconductor, with a lot of open source software. So I started to make this tool to first read those files from the ToF-SIMS machine, and then also process and analyze the data, and even export it to other formats so that you can use it with other tools that are already available. The reasons why I chose R: one, of course, is that there's the whole Bioconductor repository, with a lot of bioinformatics software; but it's also rather easy and quick to make your tools, and it has very good interfaces to C++. As soon as you come to image processing and image analysis, it is of course useful to have access to a somewhat faster programming language than a scripting language.

The software is now openly available on the Bioconductor repository. It's maybe not of too big interest for people like you working with IT, but just to say, this is a project that went on for about three, four years, and finally we got the first version out
and now we are of course looking to improve it, to add new functionality, and also to get people using it, to get publications out where people actually work with this.

Just some small examples of what this is about. We record images, so often this is about very basic image processing. As I said, we have extremely high spatial resolution, down to nanometers. The problem you often end up with: when you think of a tree, when you think of wood, it's quite big chunks of material. If you analyze in the nanometer regime, to get really reproducible results you need to take a lot of samples. For one tree you need maybe a thousand sampling points; if you work with genetically modified trees, you need maybe three to four different trees from one genetic line that you need to sample again. So we end up with gigabytes of data, and you need to be able to automate those things.

Image processing in its very simplest form is often that you need to do some thresholding, some object detection, so that you can first take away the data you are not interested in and then have the remaining data that you are actually going to do your analysis on. Image analysis with mass spectrometry is often a multivariate analysis, like principal component analysis, to find those parts of the spectrum which are of interest, to then segment further, as we were doing here, where we basically do some background detection and then remove the background. This was done with principal component analysis. You can see that in the first image we have some artifacts here in the vessel holes, which we can easily remove with principal component analysis, and then we can go to a black-and-white mask to extract those data points which we are interested in.

That's basically what I have to show you. Acknowledgements to the institutes where I worked; that was at Chalmers University in Gothenburg. There is a little thing about this project which I can maybe spend two minutes talking about.
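The background-removal step described in the talk (principal component analysis followed by a black-and-white mask) can be sketched roughly as follows. This is a minimal illustration in Python with NumPy, not the code of the Bioconductor package from the talk, which is written in R and C++. The function names, the synthetic test image, the midpoint threshold, and the assumption that tissue pixels carry a higher total ion count than the empty vessel holes are all assumptions made for this example.

```python
import numpy as np


def pca_scores(cube, n_components=1):
    """Project every pixel spectrum of a (rows, cols, channels) cube onto its
    leading principal components."""
    rows, cols, channels = cube.shape
    spectra = cube.reshape(rows * cols, channels).astype(float)
    centered = spectra - spectra.mean(axis=0)
    # Right singular vectors of the centered data are the PCA loadings
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:n_components].T
    return scores.reshape(rows, cols, n_components)


def foreground_mask(cube):
    """Boolean mask separating tissue from background via the first PC score."""
    pc1 = pca_scores(cube, 1)[:, :, 0]
    # Simple threshold halfway between the extreme scores (an assumption;
    # a real pipeline might use a more robust threshold)
    mask = pc1 > (pc1.min() + pc1.max()) / 2.0
    # The sign of a principal component is arbitrary, so orient the mask:
    # assume tissue has the higher total ion count
    total = cube.sum(axis=2)
    if total[mask].mean() < total[~mask].mean():
        mask = ~mask
    return mask


# Synthetic image: 20 x 20 pixels, 5 mass channels, with one empty "vessel hole"
rng = np.random.default_rng(0)
cube = rng.poisson(100, size=(20, 20, 5))
cube[5:10, 5:10, :] = rng.poisson(1, size=(5, 5, 5))

mask = foreground_mask(cube)
tissue_spectra = cube[mask]  # spectra of the pixels kept for further analysis
```

On real data the first component will not always separate tissue from background this cleanly, so inspecting the score images before thresholding is a sensible precaution.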
A lot of the programming was done in a collaboration with a Vietnamese IT outsourcing company which was interested in also doing R&D in the open source field. This company, TMA, is a rather big company in Vietnam, so they spent quite some money and resources and helped me with people, especially for the C++ programming, where I could tell them what I needed and they would deliver it. That was really helpful, and it was also a sort of international open source collaboration which opened quite a lot of doors, both for them and for me, for doing things which I otherwise wouldn't have been able to do. And a lot of the funding I got from the Swedish Research Council and Bio4Energy, a program in Sweden where we are looking into how we could use the bio-resources we have in Sweden in other ways, to create more and other high-value products besides just pulp and paper and biofuels, where we try to develop new raw materials and new products which create higher value than just making paper. Thank you very much. Maybe some questions?

So my interpretation of where open source has been an effective development approach is that it has been situations where there are people who are both interested and programmers, right? To what extent is that true here?

Yeah, I think in biology, in certain fields, it's quite okay actually, especially in the sequencing field. I think a lot of biologists are trained to use bioinformatics tools; there are a lot of Python toolboxes people use, and a lot of people use R, so I think the affinity to it is quite high actually, and people are open to using it. But as soon as it comes down to making solid tools, when you have a lot of data processing involved and when things should be cleanly written and stable and fast, you actually need people to help. A lot of things start as real prototypes and then get improved.

Sure, yeah, sure. I have different exposure to this, with environmental monitoring, right, and
certainly contact with the environment agency, and it's clear that it's not actually any displeasure at all, it's just that for them, they are environmental scientists, and it is somewhat...

Yeah, yeah. I think in general in biology the advantage is that you have the field of bioinformatics, which is rather big, and in bioinformatics I think there is a lot of open source activity going on. Most of the projects are, I mean, by definition funded by the state, so the results have to be published, the results have to be open. And I would say nowadays, if you apply for money, you need to declare that you will give the source code to the public and that it will be open. So this is actually, I think, on a good path. It's going slowly, but it's on a good path, I think.

Yeah, definitely. That's Sweden, Europe? How wide is that sort of pressure?

You mean research-wise? Yeah, at least for me, the grants I apply for now, not directly related to software but in general, let's say open journals: by the grants that I apply for I'm forced to, or forced, I mean I have no problem with that, to publish in open journals, so that the results that I find are available to everybody. So a part of my research budget goes to paying to publish in open journals, which I think is great. I mean, whether it's a good idea that you need to pay to publish in open journals is another question, but basically that you have to do that, that the results get open, I think is very important. And especially in those cases also with those machines: there's a lot of data out there available, you can just download it, but what is the help if it's in a closed format? So all those tools also need to be available, of course. If you can download experimental data but you cannot use it because it's in some weird format, that doesn't really help.

And there may be... it's not...

Yeah, yeah, right. I guess the companies selling those machines are used to a completely
different business. It's actually also, it's maybe three companies in the whole world producing this machine. When I did a postdoc in Japan, where one of the companies is, I was at the lab there and more or less tried to convince them to open up the data format. But, well, Japan is probably another story, in that it's difficult to get to talk to the right people to do such stuff. But there was no real interest in doing that.

Yeah, it's plants in general, not just wood. Now this very specific technique, as I said, is rather unique; there are not too many universities who actually have it. It is getting momentum now. Having very high spatial resolution is of course of big interest: in bioscience, as soon as you can start to look into single cells, you can look at what's happening in one single specific cell, it gets really interesting. This is where this technique will probably grow much more in the future, and then I hope this was some sort of starting point software-wise, which people can start to build upon to work on the data. And as long as it's so expensive to have such machines, you maybe don't want to have the machine at your university; you just go somewhere and do the experiments. And at the moment you could not even look at your data at home, because even for the software you need to pay for licenses. So now, when you have those toolboxes, you can at least take your data home and work at home with your results. So that was my idea, and I needed that for myself, because as a postdoc you're just traveling around and working here and there, and you cannot really bring the software with you. Thank you.

Yeah, yeah. As I said, there are three commonly used machines, one from Germany, one from Japan and one from France, and the one from Japan... I think as long as we keep it rather low-key, they probably don't know about the
software package. But yes, it's a proprietary format and this is reverse engineered, but it seems to work. I have to admit I didn't spend too much time scratching my head about legal consequences, but it works.

You have not been given any information under an NDA?

No, no, no.

Did you look at any documentation from the supplier? The difficulty is going to be if the institutions are under NDA agreements. Were you under one of those NDAs?

No, this I don't have. I mean, I got the files and, yeah, we worked it out ourselves.

Good.

Okay, so I'm safe. Good to know. Thank you.