Vaccination has saved more lives than any other medical intervention. However, we still lack vaccines against many pathogens, such as HIV and Hepatitis C, with devastating consequences. Hepatitis C, or Hep C, for example, is a leading cause of liver cancer and liver transplants. It infects an estimated 71 million people worldwide, 9 million right here in China, while in the US it kills more people than any other infectious disease. So we urgently require an effective, widely available vaccine. So why is it so difficult to develop vaccines against viruses like HIV and Hep C? To help answer this question, we think about genetics. The principle of vaccination is to train the immune system to recognize and clear viral particles with certain genetic codes when natural infection occurs. However, viruses can change their genetic code by making mutations when they replicate, and in doing so, they can escape the immune pressure. This is only a problem, however, if the mutant viruses have what we call high fitness, which basically means that they can effectively replicate and infect cells. Now, the big problem with viruses like HIV and Hep C is that they make loads and loads of mutations, which creates many immune escape opportunities. Even if those escape mutations lower the fitness of the virus, often this fitness can be restored by mutations occurring elsewhere through some sort of collective effects, which makes it very complicated. So what we require are vaccines that can elicit immune responses such that the mutations needed to escape those responses lower the fitness of the virus, and this fitness is not easily restored through other mutations. So in order to address this problem, we first must understand how the fitness of these viruses depends on their genetic codes. Now learning this in the laboratory is very complicated, since it would require synthesizing and testing the fitness of billions of viruses with different genetic codes.
So in our work at HKUST, as I will describe, we've been looking at a different approach to solving this problem using big-data machine learning. Now our work is inspired by some recent breakthroughs which have used data analysis to help guide the design of vaccines. An example of this is a vaccine that was developed for meningitis B, or MenB. So this is a serious disease which affects the lining around the spinal cord and the brain, and it's most common in infants. The vaccine works really well and is used routinely. Now designing this vaccine was very hard, since the bacterium which causes MenB has about 2,500 proteins and it wasn't clear which of those are expressed on the surface and could be targeted by the immune system. So scientists used data available for the surface proteins of other bacteria, along with multiple data analysis methods, to identify multiple surface proteins of the MenB bacterium, four of which were ultimately used as targets in the vaccine. So at HKUST, working closely with collaborators at MIT and looking specifically at surface proteins, we've been exploring using big-data machine learning methods to help design vaccines for viruses like HIV and Hep C. As a first step, we developed some sophisticated mathematical models which could predict the fitness of a virus from its genetic code. Our models were trained on genetic data taken from thousands of infected individuals, and their predictions lined up reasonably well when compared against fitness experiments done in multiple labs around the world. Our models allowed us to construct a particular coloured map, shown here for the surface protein of HIV, where the red regions on this map indicate those positions which appear most vulnerable to immune pressure.
That is, if a vaccine could elicit immune responses directed towards those red regions, the mutations needed to escape would likely lower the fitness of the virus, and this fitness would likely not be easily restored through other mutations. So constructing a similar map for the surface protein of Hepatitis C, the first thing we see is that there is a lot more red. And this suggests that while designing a vaccine for both viruses is very hard, it may be comparatively less so for Hep C. And this is also consistent with the fact that 20 to 30% of Hep C-infected individuals recover spontaneously, whereas nobody has been reported to spontaneously clear HIV. So moving towards an explicit vaccine design, we looked at known broadly neutralising antibodies against Hep C and specifically looked for those that latched on to the red regions. We found a select few that did this, promoting the design of vaccines that could elicit those specific antibodies. So will vaccine designs guided by machine learning ultimately give improved protection in practice for viruses like HIV and Hep C? Well, there's much, much work to be done, particularly in terms of experimentation, but the predictions are encouraging. By exploring new ways of designing vaccines using big data, it's exciting to think of the potential consequences for the future. Thank you.