I'm a computer scientist and I have been working in the area of medical informatics for years. Brain disease is obviously an important research area, and it is becoming more and more important, but translational research on brain disease requires a lot of data, particularly in clinical studies. This is largely due to the maturity of imaging technology: in a highly controlled environment we now have MRI machines in the clinical institutions, EEG is becoming more and more pervasively used, and in brain disease research body sensors also play a very important role in monitoring people's behaviour and recording it as clinical measurements. All of this data is combined with molecular-level profiling, for example the gene expression we heard a lot about this morning, to support the study of clinically meaningful biomarkers, which Michael gave a very good keynote about yesterday.

In order to utilise this massive amount of information, we really need to do two things. First, you have to build an infrastructure to manage the data. This data is the integration of the clinical information with the molecular-level information; on the clinical side, different modalities provide you different measurements of the phenotype, and all of these have to be put together. Second, you need a good analytics platform, so that all of this data can be studied, and, very importantly, studied by the clinicians, by the doctors. That is the goal of the eTRIKS project, which I am leading: a European translational information platform whose goal is to build, within five years, and we are already one year in, a common platform for managing and analysing this data. Those two things, the data management and the analytics, are what the project is building.

You probably know that in Europe we have the IMI, the Innovative Medicines Initiative, a research initiative in which pharmaceutical companies and academic institutions work together on various diseases. They generate a lot of clinical data plus molecular profiling, and all of this data is to be managed by a single platform, on which we then hope to do the analysis. Quite a number of the diseases we are dealing with are brain diseases, so the platform has to have the capacity to analyse brain disease as well. Therefore we extend traditional translational research, which largely focused on clinical measurements integrated with molecular profiles, to different kinds of modality. In particular, you have image processing and image analysis workflows; you have real mechanisms to deal with patient records and with lifestyle measurements from body sensors, all of which are quite important phenotypical information; and of course, as with other diseases, the molecular-level profiling is very important. What we do is organise all of this with an ontology, provide a flexible data model, and drive really good, advanced analytical workflows on top. Today I will talk about just two things: the data integration, and some of the analytics we have developed.

Let me take one example, a project we are doing now at Imperial on multiple sclerosis. It shows a typical data integration structure.
Patients are monitored as they live their daily lives; you have the image data; clinicians provide the historical clinical conditions as well as the information about the patients, basically the patient records; and nurses monitor the treatment and follow the patients for the longitudinal study. All of this information flows through the case report forms into the database in eTRIKS. The tissue samples go into the biobank, and those tissues are used for the genetic and molecular profiling. In summary, the integration brings these four parts together. tranSMART is used, as just mentioned, as the core molecular profiling repository for eTRIKS, but it has been extended with different databases for managing the different modalities of information; for example, XNAT is used for the image data management. What we had to do was integrate all of this information together to form a uniform repository for brain studies. In this particular integration, XNAT became one part of the eTRIKS platform for medical image storage, but we put the morphology information into tranSMART itself, using terminologies that come from CDISC, the clinical research data standard.

In this way we allow people to ask queries such as: for the multiple sclerosis patients, find all those who, after the treatment, still experienced a couple of relapses and also show some lesions in the white matter, and then retrieve, for example, all the images of those people within the same framework. That is really the basic function you need to support a clinical study. The data model is actually quite simple. The data is organised based on an extended CDISC structure: at the medical records level you provide the necessary information about the patients; at the morphology level you provide all the information about the brain structure; you have the device information as provenance; and you have the images, with XNAT linked through indexing to the eTRIKS platform. Here is a quick screenshot of how the system actually works. What you should pay attention to is the ontology structure, which is compliant with CDISC and allows you to navigate the repository across all four pieces of information: medical records, morphology, treatments, and the other patient lifestyle information. From this structure you get your answers, and then you can drive the analysis packages and workflows to do the analysis.
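As a rough illustration of the kind of record this data model holds and of the query just described, here is a minimal sketch. All class and field names (`PatientRecord`, `relapses_after_treatment`, and so on) are hypothetical simplifications for this talk; the real platform stores these levels in tranSMART and XNAT, not in Python objects.

```python
from dataclasses import dataclass

# Hypothetical, simplified view of one integrated record; the real data
# model is a CDISC-based ontology in tranSMART, with images held in XNAT.
@dataclass
class PatientRecord:
    patient_id: str
    relapses_after_treatment: int   # medical records level
    white_matter_lesions: bool      # morphology level (image-derived)
    scanner: str                    # device information (provenance)
    xnat_image_id: str              # link/index into the XNAT image store

cohort = [
    PatientRecord("MS-001", 2, True,  "scanner-A", "XNAT:img-101"),
    PatientRecord("MS-002", 0, False, "scanner-A", "XNAT:img-102"),
    PatientRecord("MS-003", 3, True,  "scanner-B", "XNAT:img-103"),
]

# The example query from the talk: patients who still relapsed after
# treatment and show white-matter lesions, returning their image links.
hits = [p.xnat_image_id for p in cohort
        if p.relapses_after_treatment >= 2 and p.white_matter_lesions]
print(hits)  # ['XNAT:img-101', 'XNAT:img-103']
```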
OK, the next part is the analysis, which I think is also quite an important part. eTRIKS adopts the existing methods for image analysis, with a particular focus on fMRI, because it is used so intensively in clinical studies. fMRI analysis, as we probably all know, has two sides. One is what people call brain mapping: you provide a stimulus and then identify which part of the brain responds. The other is the reverse direction: you build a predictive model, so that given the brain activity you can predict the state of the stimulus. Both sides have very big clinical applications, and what we are doing is enhancing the methods on both sides to support efficient analysis in a clinical environment.

I don't need to say too much about image analysis itself, since this is the expert community, but just as a quick overview: the brain mapping methods, like SPM, are very much based on the idea of general linear regression. You have the signals, and a design matrix which provides the experimental vectors, obtained by convolving the actual stimulus time series with a hemodynamic response model; the regression gives you the parameters that capture the correlation between the design and a particular BOLD signal. With a statistical test you can find the significant betas, which tells you which particular stimulus is related to that particular BOLD signal.

This idea can be extended quite straightforwardly to doing simple tests in the clinical environment. One piece of work we are doing is for MS: we want to verify the hypothesis that patients with MS still have preserved brain plasticity. So we did an experiment: you scan the brains before the treatment, then two weeks later you do the same test after the treatment, and you compare the images. You can simply use the general linear model to compute the difference, find the large deltas, and identify which voxels show a significant change. The issue is how to detect a significant change per voxel. The standard way is to use the general linear model directly and then a statistical test, but we found that we immediately had problems when dealing with real clinical images. One is that after you aggregate, the signal-to-noise ratio becomes even lower, so you really have the problem of very noisy signals. The second is that the univariate testing idea, based on the assumption that the voxels are independent, is not really realistic. So we had to think about a new method, one that is computationally efficient and that avoids the two problems I mentioned: the low signal-to-noise ratio and the unrealistic assumption of voxel independence.
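As a reference point, here is a minimal sketch of the mass-univariate GLM baseline just described: a design matrix built by convolving the stimulus timing with a canonical HRF, a per-voxel least-squares fit, and a t-statistic on the stimulus regressor. The double-gamma HRF shape, the toy dimensions, and the synthetic data are all simplifying assumptions for illustration, not the platform's actual code.

```python
import numpy as np
from scipy.stats import gamma, t

# --- Toy dimensions (assumptions for illustration) ---
n_scans, tr, n_voxels = 120, 2.0, 500
rng = np.random.default_rng(0)

# Canonical double-gamma HRF, a standard simplification.
ts = np.arange(0, 30, tr)
hrf = gamma.pdf(ts, 6) - 0.35 * gamma.pdf(ts, 16)
hrf /= hrf.sum()

# Boxcar stimulus (on/off blocks) convolved with the HRF -> regressor.
stim = (np.arange(n_scans) // 10) % 2
regressor = np.convolve(stim, hrf)[:n_scans]

# Design matrix X: the stimulus regressor plus an intercept column.
X = np.column_stack([regressor, np.ones(n_scans)])

# Synthetic BOLD data: a few "active" voxels plus noise.
Y = rng.normal(0, 1.0, (n_scans, n_voxels))
Y[:, :20] += 2.0 * regressor[:, None]

# Per-voxel least-squares fit, done for all voxels at once.
beta, _, _, _ = np.linalg.lstsq(X, Y, rcond=None)
resid = Y - X @ beta
dof = n_scans - X.shape[1]
sigma2 = (resid ** 2).sum(axis=0) / dof

# t-statistic for the stimulus regressor (contrast c = [1, 0]).
c = np.array([1.0, 0.0])
var_c = c @ np.linalg.inv(X.T @ X) @ c
t_stat = (c @ beta) / np.sqrt(sigma2 * var_c)
p_val = 2 * t.sf(np.abs(t_stat), dof)
print("voxels with p < 0.001:", int((p_val < 1e-3).sum()))
```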
So what we are working on instead is this: rather than comparing voxels one by one, we take the whole voxel matrix into a single optimization, which in machine learning terms means optimizing a regularized objective function. We are not only interested in minimizing the residual, which is the usual thing you do in the GLM; we also care about other important properties of the model. The first is sparseness. You can imagine that between the two states only a few voxels, only a few brain regions, will actually change, because most of them show no change at all, so the delta should be quite sparse. The second property is what you could call smoothness, which models the dependency between the voxels and between the regions. It can go two ways: between voxels, and between regions. In our work we do it in a simple way and just assume that neighbouring voxels should act similarly, but you could do much better by taking the neural structure into consideration, for example by computing a connectivity matrix and using that instead.

This gives you the optimization function. Basically it says: find a model that not only has small residuals, meaning small error, but that is also sparse enough, with only a few voxels you pay attention to, and that is also smooth. The lambda is the penalty variable, which provides the trade-off, the balance, between the importance of a small residual and the sparsity. Put another way, for each lambda a different delta will be computed. What we are interested in is the delta entries that survive across most of the lambdas. For each delta entry you have a corresponding lambda: you take a sequence of lambdas, starting from a very big one down to a small one; the big lambda makes the model very sparse, and the small one makes it less sparse. What you are really interested in are the variables whose effect survives even in a very sparse model, because that probably means that voxel carries the most significant change. So instead of using the t-test, a statistical test plus a p-value as the cut-off, we use the lambda directly as a new test statistic for detecting the important changes.

Here is one example. When lambda is very, very big, all the delta entries are zero, so the model is totally sparse. As you reduce the number, suddenly, when you reach nine, one particular delta entry becomes non-zero; reduce it further and another becomes non-zero, until eventually all of them are non-zero. So the lambda associated with the first entry is nine, with the other it is six, and you use this number as a measurement of how stable the variable is under a sparse model. This is quite good: it's a very simple algorithm and you can run it very quickly. Afterwards you do a permutation test, find the changes that remain significant under permutation, and then rank based on two values together: the lambda value at which a voxel enters, and the statistical p-value. In this way we can do a test which we found to be very successful: certainly better than the traditional GLM method, and also better than TFCE, the threshold-free cluster enhancement method available in the standard neuroimaging libraries; our method performs very nicely against it.
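One plausible form of the objective just described is $\min_\Delta \|y - X\Delta\|^2 + \lambda\,(\|\Delta\|_1 + \gamma \sum_{i \sim j} (\Delta_i - \Delta_j)^2)$, a sparse model with a neighbour-smoothness penalty. The sketch below drops the smoothness term and uses a plain lasso path (sklearn's `lasso_path`) on synthetic data, so it illustrates only the lambda-path statistic: the largest lambda at which each voxel's delta becomes non-zero, calibrated with a simple permutation test.

```python
import numpy as np
from sklearn.linear_model import lasso_path

rng = np.random.default_rng(1)
n_samples, n_voxels = 60, 300

# Synthetic data (assumption): rows are scans, columns are voxel values;
# y encodes the condition contrast. Only the first 5 voxels carry signal.
X = rng.normal(size=(n_samples, n_voxels))
y = X[:, :5] @ np.full(5, 1.5) + rng.normal(scale=1.0, size=n_samples)

def entry_lambda(X, y, n_alphas=100):
    """Largest lambda at which each coefficient becomes non-zero (0 if never)."""
    alphas, coefs, _ = lasso_path(X, y, n_alphas=n_alphas)  # alphas: big -> small
    entered = coefs != 0                       # (n_voxels, n_alphas)
    first = entered.argmax(axis=1)             # index of first non-zero alpha
    return np.where(entered.any(axis=1), alphas[first], 0.0)

lam_obs = entry_lambda(X, y)

# Permutation null: shuffle y, keep the largest entry lambda over all voxels.
n_perm = 200
null_max = np.array([entry_lambda(X, rng.permutation(y)).max()
                     for _ in range(n_perm)])
p_val = (1 + (null_max[None, :] >= lam_obs[:, None]).sum(axis=1)) / (1 + n_perm)

# Rank voxels by entry lambda (big = survives a very sparse model), then p-value.
order = np.lexsort((p_val, -lam_obs))
print("top voxels:", order[:5])
```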
The next part, which I will go through quickly, is the prediction side. In our platform we build workflows for multi-voxel pattern analysis. The idea is pretty simple: you give a stimulus and you find the corresponding response, as in brain mapping; then, over multiple stimulations, you label the voxels that respond, and you build a predictive model, a classification model. This is brain reading: given an input, a voxel picture, the model can predict what the status of the stimulus was. For this work we again use a linear sparse model, because as I said before, the brain map is highly sparse, so a linear sparse model is a very good choice.

But we still have a difficulty here: sample size. No matter how much money you have, however many scans you can do of the brain, the number of samples is still very small compared with the number of voxels. Fortunately, because the model is sparse, we can use compressed sensing methods to overcome the problem. Even with compression, though, the number of samples needs to be on the order of s log(p/s), where s is the number of non-zero, responding voxels among the p voxels in total, and that number is still big. So we thought about how to reduce this number and make our recovery algorithm better. We did a piece of work using a sort of ensemble learning: we cut the voxel set into subsets, learn sub-models on them, and afterwards rank the results.

But with this model we hit a big problem called stability. Yesterday the reliability of biomarkers was mentioned; remember that what we are actually looking for as a biomarker is the set of voxels that is correlated with the response. For a biomarker to be meaningful, it needs to be stable: during cross-validation, no matter how you cut the samples, those voxels should stay selected in all the models. Usually when people do model selection they only care about predictive performance; in our case we combine the performance with the stability. This is another very big piece of work we have been doing, developing this workflow for finding stable biomarkers, in this case the responding voxels in the images, in a clinical study. Usually people don't care too much about stability, but it is a very important issue. I won't have the chance to say much more about it here; there is a published paper in a journal you can have a look at, where you can see that our model is a trade-off: it has high predictive precision but also very good stability, so the voxels selected inside the model survive across all the cross-validation folds.
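As a rough sketch of the decoding side and of scoring stability alongside accuracy, here is a minimal example. It assumes an l1-penalised logistic regression as the sparse linear classifier and uses selection frequency across cross-validation folds as the stability measure; this illustrates the idea, and is not the published workflow itself.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold

rng = np.random.default_rng(2)
n_samples, n_voxels = 80, 300

# Synthetic decoding data (assumption): binary stimulus label, with only
# the first 10 voxels actually informative, as in a sparse brain map.
y = rng.integers(0, 2, n_samples)
X = rng.normal(size=(n_samples, n_voxels))
X[:, :10] += (2 * y[:, None] - 1) * 1.0

accs, selected = [], np.zeros(n_voxels)
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for train, test in cv.split(X, y):
    clf = LogisticRegression(penalty="l1", solver="liblinear", C=0.1)
    clf.fit(X[train], y[train])
    accs.append(clf.score(X[test], y[test]))
    selected += (clf.coef_.ravel() != 0)         # which voxels this fold keeps

stability = selected / cv.get_n_splits()          # selection frequency per voxel
stable_voxels = np.flatnonzero(stability == 1.0)  # survive in every fold

print(f"mean accuracy: {np.mean(accs):.2f}")
print("voxels selected in all folds:", stable_voxels[:10])
# Model selection would then trade off mean accuracy against stability,
# e.g. picking C to maximise accuracy subject to a stability threshold.
```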
So, to conclude my talk: we built two things. First, a repository where the medical images are managed and closely aligned with the clinical information, so that people can answer clinically relevant questions with respect to an imaging study. Second, we are developing many workflows for image analysis; these workflows enhance and adapt traditional medical image processing, from the neuroscience study setting to the clinical context, where you deal with a large image repository. That's pretty much my talk. Thank you.