Yes, we can hear you, but we can't see your slides. You need to share your slides. No, no, it's okay. It's fine on my side. Oh, it's okay. Okay, so it's my problem. Okay.
Okay, today my topic is the reconstruction of transmission pairs for COVID-19 in family clusters from crowdsourced data. My name is Xiao-Ke Xu, and I come from Dalian Minzu University. Data is very important for the study of COVID-19. In China, starting from the middle of January, most provincial and urban health authorities began to report detailed information on COVID-19 cases. Here I show an example of a case report. All of this information is reported in Chinese, so I have translated it into English. You can see some important information, like her surname, her gender and her age. We can also find some epidemiological information: for example, she had close contact with a confirmed case at a certain place during a certain time. So we collected a lot of these case reports, which were published on official websites or social media platforms. We collected all the cases manually, because the work can be done very accurately that way. In the end, we collected about 10,000 records from 27 provinces and nearly 200 urban health authorities in China. All of this data is from outside Hubei province, because in Hubei we can't get detailed information on cases; there are too many cases. The time span is from January 19th to March 5th. After that there were few cases in China, so we did not collect data after this time. Based on this data set, five categories of detailed information can be extracted, such as demographic information, past mobility traces and so on.
Then we need to do some coding. The coding process was done by 30 research assistants, who worked hard, 10 hours a day, for nearly a month. Basically, they are my graduate students. After their hard work, we got a database. The data fields and their content formats are like this. They are divided into five categories. The first is demographic information; the data fields include age, gender and occupation. The second is mobility traces: if the case came from another city, we need to know the departure city, the destination city and the date of travel. The third is the infection information. For example, we want to know who was infected by whom, and what their close-contact relationship is, for example husband and wife, or father and son. Then we extract the epidemiological timeline, that is, some medical information about the case, for example the date on which the person was hospitalized. The final category is clinical symptoms. Lastly, we list all the original information in Chinese, and we then use Google Translate to translate it into English. Anyway, after this hard coding work, we got a large and very accurate case data set, and the format is like this. So this is big data for COVID-19, because in this data set we have nearly 10,000 cases. Disease researchers sometimes have only 10 or 100 cases.
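The five categories of coded fields described above could be represented roughly as follows. This is only a sketch: the class name, every field name and the example values are assumptions made for illustration, not the actual column names or contents of the speaker's database.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class CaseRecord:
    """One coded case report. All field names here are hypothetical."""
    case_id: str
    # 1. Demographic information
    age: Optional[int] = None
    gender: Optional[str] = None
    occupation: Optional[str] = None
    # 2. Mobility traces (only filled when the case came from another city)
    departure_city: Optional[str] = None
    destination_city: Optional[str] = None
    travel_date: Optional[date] = None
    # 3. Infection information: who infected this case, and their relationship
    infector_id: Optional[str] = None
    relationship: Optional[str] = None
    # 4. Epidemiological timeline
    symptom_onset: Optional[date] = None
    hospitalization: Optional[date] = None
    # 5. Clinical symptoms
    symptoms: List[str] = field(default_factory=list)
    # Original report text in Chinese and its English translation
    report_zh: str = ""
    report_en: str = ""


# A made-up record, not taken from the real data set
example = CaseRecord(
    case_id="demo-001",
    age=45,
    gender="female",
    departure_city="Wuhan",
    destination_city="Dalian",
    symptom_onset=date(2020, 1, 25),
    hospitalization=date(2020, 1, 27),
    symptoms=["fever"],
)

# Days between symptom onset and hospitalization for the made-up record
delay_days = (example.hospitalization - example.symptom_onset).days
```

Fields that a case report does not mention simply stay `None`, which matches the fact that public reports vary in how much detail they give.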
So we can calculate some statistics, such as the distribution of age and gender. Here we show the figures, and we can see the distribution of gender and so on. We can also plot the cumulative number of symptom onsets over this time. So we can get a lot of information about COVID-19 in China. We can also get the mobility traces of all the cases. Here each line is one trace, and we use different colors for its two ends: yellow means the departure city and red means the destination city. We can see that most of the cases came from Hubei province. The provinces are colored by their number of collected case reports. We can see that most of the cases are in provinces near Hubei, because there is a large traffic flow between them. So there are more cases near Hubei. Up to now, we have built a database of nearly 10,000 cases, and here we list some possible downstream tasks. For example, this forum is about machine learning and artificial intelligence. Yes, AI technology can be used for many things. If we use AI technology to extract information from the case reports, we can use this data set as a benchmark. In this talk, we focus on the reconstruction of transmission events. Now we introduce the process of reconstructing the transmission events and the clusters. Based on these cases, we identified about 1,400 transmission pairs with strong evidence that the infectee was infected by the infector.
Each transmission pair was established according to the travel history, social relationships or other information provided by the public case reports, and was also validated by two or three research assistants. For example, if a person traveled to Wuhan and then went back to his hometown, and within a week he and his family were found to be infected with COVID-19, then we can build a transmission pair between him and each infected member of his family. We call him the infector, and the other member of his family the infectee. If the infectee had multiple possible infectors, the infector of the transmission pair is taken to be the one who reported the earliest symptom onset. For each transmission pair, the infector is the primary case and the infectee is the secondary case. After building the transmission pairs, we also consider the connected chains of confirmed cases, in which we term the original case the index, and the entire chain of cases the transmission cluster. Here we show the two kinds of clusters. The first is the household transmission cluster, and the second is the non-household transmission cluster. You can see that their structures are different. Household means the cluster is formed by family members living in the same house; the cases belong to the same household. The non-household transmission cluster is formed by non-household relatives, like your brother's wife, your aunt or your uncle, who do not live in the same house, and by colleagues, classmates, friends and other face-to-face contacts. You can see that the household clusters are small, because the family is small. In China, I think most families have two to six members.
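The pairing rule and the notion of a transmission cluster described above can be sketched in a few lines. The case labels, dates and candidate exposures below are made up, and the function names are hypothetical; the talk gives the rule (earliest symptom onset among possible infectors, then connected chains of cases) but not an implementation.

```python
from collections import defaultdict
from datetime import date

# Made-up symptom onset dates per case
onset = {
    "A": date(2020, 1, 20), "B": date(2020, 1, 24), "C": date(2020, 1, 22),
    "D": date(2020, 1, 27), "E": date(2020, 1, 23), "F": date(2020, 1, 28),
}
# Made-up candidate infectors for each infectee, as read from case reports
candidates = {"B": ["A"], "C": ["A"], "D": ["B", "C"], "F": ["E"]}


def pick_infector(possible, onset):
    """Among several possible infectors, keep the one with the earliest onset."""
    return min(possible, key=lambda case: onset[case])


# One (infector, infectee) pair per infectee
pairs = [(pick_infector(src, onset), infectee) for infectee, src in candidates.items()]


def transmission_clusters(pairs):
    """Group cases into clusters: connected components of the pair graph."""
    neighbors = defaultdict(set)
    for infector, infectee in pairs:
        neighbors[infector].add(infectee)
        neighbors[infectee].add(infector)
    seen, clusters = set(), []
    for start in neighbors:
        if start in seen:
            continue
        component, stack = set(), [start]
        while stack:  # depth-first walk over one connected chain
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(neighbors[node] - component)
        seen |= component
        clusters.append(component)
    return clusters


clusters = transmission_clusters(pairs)
infectees = {infectee for _, infectee in pairs}
# The index case of a cluster is the member that nobody in the data infected
index_cases = [sorted(c - infectees) for c in clusters]
```

With this toy input, case D has two possible infectors (B and C) and is paired with C, whose onset is earlier. Cluster size is then simply `len(component)`, which is what a size threshold for super-spreading clusters would be applied to.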
So the household transmission cluster is small, but the non-household clusters can be regarded as social community spreading, public spreading. So we can find some super-spreading clusters like this. We also show the hazard of infection, stratified by age, for household relative to non-household transmission. Here red means a high probability of infection within the household, and blue means a high probability that infection happens outside the household. The rows are the primary cases, and the columns are the secondary cases. We can see here that, for the secondary cases, the group of children, who are younger than 17, and the group of older people, who are older than 35, have a high probability of being infected in the household, because this value is larger than one. The children especially have a high probability of being infected by their father or mother, or their grandfather or grandmother. Next, if we look at the primary cases, we can find that older people are more prone to cause household infections, because this value is the largest. We also show the hazard of infection stratified by gender for the two types of transmission.
Again, we can find that for different genders, the two cases in a pair have a high probability of household transmission, but for the same gender, they have a high probability of transmitting the disease outside the household. The reason is that, in the family, the husband can transmit it to his wife; in a couple, the two persons can transmit the disease to each other. But in the workplace we mostly contact people of the same gender, so here we find the high probability in the same-gender case. After comparing the household and non-household transmission structures, we can also find that household and non-household transmission can be coupled together to form larger transmission clusters. Here we show about 600 transmission clusters. In particular, we show the special cases, the super-spreading cases, which here means the size is larger than six nodes. We can see here that the three kinds of nodes, the primary cases, the household secondary cases and the non-household secondary cases, all appear in the super-spreading clusters. After that, we wanted to compare COVID-19 to SARS and seasonal flu. Here we show the serial interval and the incubation period of SARS. We can find that, for SARS, a case began to infect others after symptom onset. So, with this period of time, the serial interval is large.
If we compare with seasonal flu, we can see that infection can happen during the incubation period. So even if a case doesn't show any symptoms, it can transmit the disease to other persons. So we want to know what the pattern is for COVID-19. Actually, we find a very interesting phenomenon: for about 10% of transmission pairs, the serial interval is less than zero, a negative value. It means that presymptomatic transmission exists. And this presymptomatic transmission makes it difficult to prevent COVID-19 from spreading, because it is very difficult to stop. We also want to calculate the risk for transportation of the disease from Wuhan to other cities. Based on the data up to January 22, 2020, we find that more than 100 cities have a high risk; the probability is larger than 50%. They are the red nodes here. This figure is very nice. It is not our work, but we can see that, even after the lockdown, cities all over China were affected, so there are many transmissions here. So, basically, we studied these cases. Each city deployed different types of intervention measures, and we collected them and list them like this.
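The negative serial intervals mentioned above follow directly from the usual definition: the infectee's symptom onset date minus the infector's symptom onset date. A minimal sketch with made-up onset dates (the function names are hypothetical, and the resulting share is not the figure from the talk):

```python
from datetime import date


def serial_interval_days(infector_onset, infectee_onset):
    """Serial interval: infectee onset minus infector onset, in days."""
    return (infectee_onset - infector_onset).days


def negative_share(intervals):
    """Fraction of pairs whose infectee showed symptoms before the infector."""
    return sum(1 for days in intervals if days < 0) / len(intervals)


# Made-up (infector onset, infectee onset) pairs
pair_onsets = [
    (date(2020, 1, 20), date(2020, 1, 25)),
    (date(2020, 1, 22), date(2020, 1, 21)),  # infectee fell ill first
    (date(2020, 1, 23), date(2020, 1, 27)),
    (date(2020, 1, 24), date(2020, 1, 24)),
]

intervals = [serial_interval_days(a, b) for a, b in pair_onsets]
share = negative_share(intervals)
```

A negative value means the infectee developed symptoms before the infector did, so the transmission must have happened while the infector was still presymptomatic.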
We can also compare the strategies of different countries and different cities. Here we compare the cities of Xi'an and Nanjing. We find that in Xi'an there are more local cases; in contrast, in Nanjing we find fewer local cases. So we can see that Nanjing had a better strategy for the prevention of COVID-19. We can also look at the dynamics: the basic reproduction number decreases with time. So we can see that proactive social distancing in China was effective. We can reach the same conclusion using the serial interval. Using all the cases in this time span, we calculate that the average serial interval is about five days. We also divided the time into three periods, pre-peak, peak and post-peak, and calculated the serial interval using the cases of each period respectively. For the pre-peak period the serial interval is about seven days, for the peak period it is about four days, and for the post-peak period it is about two days. So we find that the serial interval becomes shorter. We built a model to show that if we can find cases in time, with very fast isolation, the serial interval is shorter, but with a longer isolation delay the serial interval is larger. Using our real-life data set, we find the same phenomenon. That means the isolation strategy of the Chinese government was very successful, because the serial interval was shortened from seven days to nearly two days within a month, which is much shorter.
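The per-period comparison above amounts to grouping transmission pairs by epidemic period and averaging the serial interval within each group. The sketch below makes two assumptions the talk does not state: the cut dates `PEAK_START` and `PEAK_END` are placeholders, and a pair is assigned to a period by the infector's onset date. The onset dates are made up.

```python
from datetime import date
from statistics import mean

# Placeholder period boundaries; the talk does not give the actual cut dates
PEAK_START = date(2020, 1, 23)
PEAK_END = date(2020, 1, 29)


def period_of(onset):
    """Label a date as pre-peak, peak or post-peak."""
    if onset < PEAK_START:
        return "pre-peak"
    if onset <= PEAK_END:
        return "peak"
    return "post-peak"


def mean_serial_interval_by_period(pair_onsets):
    """Average serial interval (days) per period, keyed by infector onset."""
    groups = {}
    for infector_onset, infectee_onset in pair_onsets:
        days = (infectee_onset - infector_onset).days
        groups.setdefault(period_of(infector_onset), []).append(days)
    return {period: mean(values) for period, values in groups.items()}


# Made-up (infector onset, infectee onset) pairs
toy_pairs = [
    (date(2020, 1, 15), date(2020, 1, 22)),
    (date(2020, 1, 18), date(2020, 1, 24)),
    (date(2020, 1, 25), date(2020, 1, 29)),
    (date(2020, 2, 2), date(2020, 2, 4)),
]

by_period = mean_serial_interval_by_period(toy_pairs)
```

Faster isolation truncates the window in which an infector can still transmit, so later transmissions are removed from the sample and the observed mean shrinks, which is the pattern the speaker describes.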
Here I have just introduced our work on reconstructing transmission events. As for possible future tasks, I think we can do some more work, but actually I think this data set belongs to all researchers. We will open-source the data set; we are trying to publish it in the journal Scientific Data, and even now we can provide the data set to interested researchers. That is all of my talk. Thanks to my collaborators. Thank you.
Thanks very much, Professor Xu, for a very interesting and timely talk. There are some questions; can you see the questions? The first question is: would you please give us some idea about the effective R, the R zero, of the superspreaders alone?
Ah, sorry. Can you repeat the last part?
About the effective R zero of the superspreaders.
No, we can't. We have tried a lot of measures, but we have no data on the family size, no denominator, so we cannot get the basic reproduction number for all the cases, and so for the super-spreading cases we also cannot get the reproduction number. We can only get the transmission pairs, the transmission events, so we cannot get that result. I think that is a question for a specialist in this field.
There is also one more question: is there any generic similarity between the social behavior of the superspreaders?
Yes, maybe. Actually, we find a correlation between the travel behavior and super-spreading for the infectors, but we only have the open-source data.
I mean, yes, if someone has infected many persons, he has traveled to many places and contacted many persons. But maybe there are cases who have gone to a lot of places and contacted many persons but didn't infect anyone. We don't know about these cases, so I can't draw a conclusion. Basically, I think the two questions are about the shortcomings of crowdsourced data from the government. It is very different from data obtained in hospitals. But its advantage is that it is a large-scale data set and we can get it very fast. I mean, we can get the data in time, because the government publishes this data every day, so researchers can try to use it to do some work. Thank you.
Okay. So, due to the time limit, let's thank Professor Xu. And Matthew? Yes, yes, it's your turn. Okay.