So the goal of my research team is to push forward fundamental NLP technologies. Most of the students work on fundamental NLP problems such as parsing, and also information extraction problems, to extract structured information from unstructured text, to make algorithms intelligent in understanding human languages and in generating human languages across domains. What's more fundamental? We're trying to train algorithms that can learn and that can generalize from a more cognitive point of view. A real NLP system is an NLP system that can really understand what you say. I know someone's opinion may contradict yours. Where's my friend Allen? It's all about your perspective. Who are we and what is the nature of this reality? Five, four, three, two, one. What's up everyone? Welcome to Simulation. I'm your host Allen Saakyan. We are on site at the beautiful Westlake University in Hangzhou, China. We are now going to be talking about natural language processing, NLP. We have Dr. Yue Zhang joining us on the show. How are you? Hi Allen. Thank you so much for coming on the show. It's such a pleasure for us as well. Thank you. For those who don't know Yue's background: he's an associate professor and PI at Westlake University focused on machine learning based natural language processing, web information extraction, and financial market prediction. In NLP he works on fundamental problems such as parsing and generation, in particular for English and Chinese. You can find his links in the bio below. All right Yue, let's start things off with one of our favorite questions we like asking our guests. What are your thoughts on the direction of our world? This is a complex question. I think we're becoming very technological and the trend will not stop. I think we'll grow more and more technological, but whether this is a correct direction is debatable, I think. I mean, this might be a philosophical question which we can discuss a little bit later.
We've had so many conversations on our show about the direction of our world advancing technologically, calling it a wisdom race: as the exponential technologies become democratized, the power of causing malevolence becomes democratized, catastrophic malevolence, and our wisdom has to increase. Our consciousness has to rise up fast, our awareness must rise up fast, and that awakening process can be catalyzed also through technology. So it's this process of figuring out how we awaken fast enough to be able to deal with the godlike power that we're unlocking. I'm a big, big fan of that one. What would you say is a key skill or essential that we can embody to make sure that we win that wisdom race? First of all, I don't completely agree that we're necessarily in that wisdom race, but it's true that we need to be more technological in order to survive in the current society. There are a lot of interesting things to discover and there are a lot of changes that we need to adapt to. In order to better survive, at least we need to ensure that we have the necessary education, and I think that's the most important thing. And maybe we need to be creative, because with every technological advance the way people work changes. For example, when machines replaced manual tools, a lot of jobs changed, and nowadays, when artificial intelligence comes into play, maybe more manual jobs will be replaced and people will have to move to more creative jobs. I think that's the trend. But as I said, I don't necessarily agree with this trend, because the way life should be is very debatable from a philosophical point of view. Why do you not think we're in a wisdom race? Well, I think we are, but maybe passively, because we exist in this society and it's moving technologically and we have to participate in the race. But this is not necessarily what people subconsciously are willing to do.
For example, a lot of technologies are created to make your life easier, but in fact everyone feels that life is becoming more and more difficult, because you have to acquire more knowledge in order to adapt to the new ways of living, versus maybe some of the more primordial ways of living which are still somewhat present in some of the less developed places on the planet, where you're just kind of chilling there in the day. You get some water, get some food, and then you just chill. So it's debatable whether or not being pushed into the economic machinery of cities is actually a higher quality or a higher standard of happiness and flourishing. Yeah, that's what I meant. That's what you mean. Oh my gosh, that's a big one. Maybe that's different from a lot of people. That's a big one to unpack. I think the statistics are somewhere in the ballpark of maybe eight out of every ten people living within 60 or so miles, 100 kilometers or so, of a coast. Yeah, and those are mostly cities, big cities. Yeah, over 50 percent of people living in big metropolises with economic machinery that's buzzing every single day. Yeah, and people are hectic every day. The email boxes and the text boxes and the WeChats are all filled and you gotta reply, and your tasks switch all the time, and you have to earn the money so that you can pay for the rent and pay for the food. Yeah, and this is the way we live right now. Okay, we'll come full circle back to this exact topic at the end. Let's get into the journey. Yeah. Who were you growing up, and how did you get interested in computer science? Right, so I got interested in computer science because I had a chance to play with computers when I was a kid, and at that time I thought, okay, computers are intelligent, because I can play games and the computers can tell me that six plus five is 11, right?
So that made me very interested in learning algorithms, and as I grew up I started to learn the mathematics and data structures and so on that are behind computer science, and I started to be able to hack some of the games. I became more and more attached to computer science, and I chose that as a major when I went to college, so I continued this line of work. Cool, cool. Those are always really powerful moments. First of all, computers are just a massive hack for humans to process information, and I mean, wow, what a crazy advancement computers have been in the last century. And then it's always really powerful when you are able to adjust a variable in a simple computer game and it literally, immediately has an effect on the game, and you're like, ooh. Exactly. For example, you can figure out where in memory your health points are and lock that part of the memory, and then you can play on forever. Yeah, yeah, and not take damage, things like that. Or you can lock your money to be infinite and just buy whatever you want. Yeah, in a strategy game, of course.
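The memory trick described above works by narrowing down candidate addresses: you snapshot memory, change the in-game value (take damage, spend money), and keep only the addresses whose stored value matched the known value at every observation. Here is a toy sketch of that filtering idea in Python; the `narrow_candidates` function and the simulated five-address "memory" are purely hypothetical illustrations, not any real game-hacking tool.

```python
def narrow_candidates(memory_snapshots, observed_values):
    """Return the memory addresses consistent with every observation.

    memory_snapshots: list of dicts {address: value}, one per snapshot.
    observed_values: the known in-game value (e.g. health) at each snapshot.
    """
    # Start by considering every address in the first snapshot a candidate.
    candidates = set(memory_snapshots[0])
    for snap, value in zip(memory_snapshots, observed_values):
        # Keep only addresses that held the expected value at this snapshot.
        candidates = {addr for addr in candidates if snap.get(addr) == value}
    return candidates

# Three snapshots of a toy 5-address "memory" while health goes
# 100 -> 85 -> 70; only address 0x2 tracks the health value throughout.
snaps = [
    {0x0: 7, 0x1: 100, 0x2: 100, 0x3: 3, 0x4: 85},
    {0x0: 7, 0x1: 100, 0x2: 85,  0x3: 3, 0x4: 85},
    {0x0: 7, 0x1: 100, 0x2: 70,  0x3: 9, 0x4: 85},
]
print(narrow_candidates(snaps, [100, 85, 70]))  # -> {2}
```

Once the search collapses to a single address, "locking" it simply means writing the desired value back to that address on every game tick.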
Okay, and that was at Tsinghua University, the undergraduate degree in computer science, and then you made this big leap to Oxford for your masters and for your PhD. Let's talk about what it was like, first, to get into statistical machine translation, from Chinese to English, but also the immersion into the United Kingdom and into English culture versus Chinese culture. What was that like, the cultural difference? Because I went from China to the UK in 2005, and at that time the difference between China and the UK was relatively large, I believe, because China was relatively less developed. So when I went to the UK I felt everything was expensive, and my English was not that good as well. My classmates called my English Jinglish, because I was from Beijing and I used to say "er" a lot of the time, and some of them said this is pirate English. Yeah, like that. So I got adapted to the English environment. And it's like my broken Chinese is so bad right now, people can't understand me at all; even the machines that I try to use to translate can't understand my Chinese, it's so bad. Well, when we were in a free chat I just mentioned four characters; I think you can speak that well, at least I can understand it. But anyway, in terms of education, one of the sharp differences I found was that in Oxford, when I took a course and submitted my exercises, every time I got a correct mark, a tick. This was very different from Tsinghua, because in Tsinghua, and when I was in high school, every time I submitted something it would very easily get a cross, which means incorrect. So I felt that the education was very encouraging, not very strict. From a simple difference like having an X for an incorrect answer versus having a check mark for a correct one, you saw a greater amount of encouragement, from a process that was kind of rewarding the proper answers. I believe, at least I felt, that the marking was
very encouraging, because there are a lot of ways you can understand and answer a question, and most of the time I got a correct mark, and I was encouraged and gained a lot of confidence in my learning. Actually, when I was in Oxford I chose seven courses and I got an A plus for all of them, and later on people said that was remarkable in Oxford, because nobody seemed to have done that. But there's a twist in the story: later, when I became a teaching assistant, I realized, oh, it's not so difficult to give correct marks to every question. It's actually easier to give correct marks to every question than to find a particular error in a student's marking sheet. So I think part of the encouraging marking scheme may result from the reluctance of the TA to spend too much time on student marking. Whoa, I mean, that's ironic, right? Yeah. So then what formula is actually optimal for education, for marking, or for helping with the closed-loop feedback for a student to learn? I mean, personally, I believe that marking the answer sheet in a more encouraging way is necessary, but maybe in the Chinese system the teaching assistants were more serious than in the UK system. And then what about the immersion into statistical machine translation from Chinese to English? Teach us about how you got interested in that and what those first days were like. Right, right. So I chose natural language processing also because I'm so interested in machine intelligence. When I was choosing a supervisor in Oxford there were quite a few alternatives, and some professors also worked on software engineering and other topics, but after I took the course and learned about natural language processing, I thought, this is really interesting. And just a few months after, I developed my first system that could parse Chinese. I was so proud, because I saw on the monitor that my system could intelligently analyze the structure of Chinese sentences. I
thought, this is amazing, again. So I went into this field, until now. At the time I started working on natural language processing it was not a very popular field, so when I mentioned the word NLP people would not know it. People would think, okay, this might be some psychology therapy or something. It's very different from nowadays. There's also neuro-linguistic programming, which is another tactic that people use in motivation or actualization or persuasion, all this type of stuff, so that's the other NLP; this is natural language processing. Right, yeah, yeah. Okay, and then it's also very interesting thinking about when you first figure out how to parse a sentence. I mean, the whole idea that we found a way to vibrate our vocal cords to create meaning for another creature to understand, and then share knowledge through that process, and then figure out words, which then make it easier. Words are like a compression algorithm for communicating in the reality that we're in. And so now you're like, okay, well, let's label the natural language that we've created and let's parse it for meaning. Okay, so you have to parse it for both syntax and semantics, so meaning is semantics, and syntax is the order of nouns and verbs, the structure of the sentence. Yeah. So you have to do that for Chinese, which is way different than English. All right, so the underlying technologies are similar, because when I started to do research people had created linguistically annotated corpora, where experts basically manually marked what the sentence structures are, for both English and Chinese, and then what's left for the algorithm to do is to learn from those manual labels the regularities, or the patterns, or the statistics. After learning from human annotation, the algorithm can try to mimic human annotation when a new test sentence is given. So the algorithm will basically just use statistical information to do the
prediction for unseen cases. That helps me use similar algorithms to deal with English and Chinese, and I don't have to be a deep linguistic expert to be able to develop those algorithms, and that's a good thing. But for Chinese language processing there are also unique problems to solve, for example word segmentation, because Chinese sentences are written as continuous sequences of characters without word delimiters. This is quite unlike English, so there are unique problems of whether you can segment a sentence into words before other natural language processing tasks can be performed. Would that be like how English has words like the, of, and to? Yes, Chinese has function words as well, but every word is written continuously. Just imagine you have an English sentence without spaces; Chinese sentences are written like that, without spaces, until the period. Until the period, yeah. Oh my gosh, whoa, okay, interesting. Already such a fascinating difference between the two languages. Okay, so that would be like us having to figure out where one word ends and where the next word starts. So then it could be that three characters equal a word, or only two of the characters equal a word. That's correct, yeah. Or where just one character is a word and the next character is already the beginning of the new word. Yeah, yeah. Or maybe 10 characters could be a very long word, but that's rare. What's the longest word? The longest words in some of the corpora are between 10 and 20 characters, and those are idioms, I think, or transliterated terms; there are long names that after transliteration become a lot of characters. Whoa, okay. So let's come to this parsing analysis. We have something like a need to understand both the syntax and the semantics, so how does that work? Okay, let's start with: is this a reinforcement machine learning process? No. For parsing, if you have manual labels for every constituent
in a constituent tree, or for every dependency relation in a dependency tree, then you have a supervision signal over every decision, so you don't have to do reinforcement learning. But your question is also interesting, because there have been attempts to learn syntactic structure from unlabeled data, and when that happens, techniques such as reinforcement learning can be useful. At one point the data was unlabeled, and then we labeled the data to train a machine, and then that's reinforcement? No, that's not reinforcement, that's supervised learning. Supervised, yeah. Okay, so us labeling unstructured data is supervised machine learning. Yep. And reinforcement machine learning is when we label the data, the machine learns, and then it gives us another understanding, and then we keep correcting it, or how? Right, reinforcement learning is more like learning by exploration. You ask the algorithm to explore the possible structures by itself, but at a certain stage you give it a reward function to tell it whether its labeling is correct. This kind of reward function might be external: for example, you make the machine parse a sentence by itself and then ask it to make some decisions, such as predicting the sentiment of a sentence based on the parse tree. The sentiment is human labeled, so you give the parsing algorithm an external reward, and the parsing algorithm will come back and figure out, okay, maybe this is not the correct syntax, so it has to change the syntax. Okay, okay, got it. Supervised machine learning is labeling unstructured data and feeding that into the machine, and reinforcement learning is assigning a reward function for the machine to go towards, so that if it does succeed at picking, with a certain amount of confidence, the most correct word from the audio, the word you rewarded it for, then it learns that, okay,
that was the word chair. Sort of, yeah, sort of. You know, this is all really, really complicated. That's too technical, too specific to the field. It's very, very technical, especially when there are differences between Chinese and English. I think that's so interesting. Let's briefly mention along the way as well: you built ZPar, the statistical multi-language parser for English and Chinese. When was that built, how did that come up for you, and how is it being used? Right, so ZPar was a project that I started when I was a student in Oxford. Basically, I was trying to work on parsing for the two languages, and I was not so satisfied with existing libraries, so I decided to develop my own parsing library. My initial thought was that I wanted to optimize every bit in my software so that it runs fast, takes the least memory, and is my own framework, so that I can develop novel algorithms on it. It all started with that, and I remember the first version of ZPar started with my first paper on parsing. Later, as I went to Cambridge to do my postdoc research, and after I went to Singapore to be a faculty member, I continued my development of ZPar along the way. Yeah, so that was an ongoing project, and it has been used by so many people now around the world. It's cool seeing something that you make that makes a really significant push for an entire field to understand NLP better. I think one unique thing about ZPar is its optimization, and because it runs fast it draws attention from a lot of people. This kind of optimization is not only from my software engineering optimization but also from the underlying algorithms, because at the time I was doing the research, people working on parsing carefully designed dynamic programming algorithms to make sure that the search algorithm could find the highest-scored parse tree under the model, and that causes a problem, which is a trade-off between efficiency and accuracy, because the more
sources of information you want to use for your parser, the slower your parser will run, due to the use of dynamic programming. Instead, I tried to develop an algorithm called a transition-based algorithm. It allows us to use whatever sources of information we need with linear runtime complexity, and the sacrifice is that the inference algorithm is not optimal. I tried to solve that problem by designing machine learning algorithms to guide the search algorithm, so that the algorithm can run both fast and accurately. I think that was the success of ZPar. Now the whole field has transitioned from a statistical machine learning driven field into a deep neural network learning driven field, and at this time transition-based parsers have become relatively less popular than they were maybe six years ago. But I still continuously work on transition-based parsers, and this year, in a top conference, there was one paper from Berkeley confirming that a recent parser from our research group, when combined with a large-scale pre-trained language model, still works the best on the benchmarks. So that's a continuation of this line of work. Speed and accuracy are the big ones, yeah. When we speak into voice-to-text we care a lot about speed and accuracy. Yeah, yeah, especially when you think about industry, and people worry about the speed of their systems because they have so many requests from users. Yes, yes. And then you mentioned that you went on to this assistant professorship at Singapore University of Technology and Design, and that was for six years. Teach us about that period of time and what it was like being a professor, studying and teaching young people. All right, so the thing about being a professor is that you take more responsibility in organizing a research group, and you also do teaching. For the teaching part, I love teaching, and part of the reason I wanted to be a faculty member is because I can teach. I can organize
whatever the research field has achieved into teaching materials, and I can impart that to students so that they are equipped with the necessary toolkit. So I like the teaching part, and I like writing textbooks as well. That's one part. In terms of research, being a faculty member is very different from being a postdoc, because you have to run a research lab, you have to think about how you can find research funding to support the postdocs and the PhD students, so you have to devote a lot of time to writing research grant proposals to try to get money, and then you have to work with your students. When I started as a faculty member I didn't have a lot of students, so I spent a significant proportion of the time working alone, as if I were a postdoc, but as I got more and more students I had to spend time working with the students. So that's a transition: I spend time teaching the students how to do research, giving my research ideas to them to execute, and helping them write research papers. I think that's a difference. But as I see so many students graduate from my team, including my research assistants, who go to top universities in the world with full PhD scholarships, I feel very good. Wow, yeah. What an important responsibility to take on, professing, running a research group, helping teach other students and making them really successful. That's a great responsibility to take on. And then what about the transition to Westlake? This happened a year ago. That happened a year ago, but the initiative to move back to China started more than a year ago, because my family decided to move. My move to China was for a completely personal reason: it concerned the choice of my whole family, including my wife and my kid, the education of my kid, and being closer to our parents. When I started to look, I searched for different universities, and then Westlake came
into my sight when I looked for research universities, and luckily, when I contacted Westlake, I was invited for an interview within just a day, right, and then I came here and I realized, oh, the atmosphere is really exciting. So I stopped looking further and decided, okay, I will come. Yeah, and after I signed the contract, and I signed the contract in 2017 actually, I thought I needed to finish my research project in Singapore, and I had to see that most of my graduate students graduated, so I took a transition period and formally moved back in 2018. Yeah, and now, being here, you guys have a lot of people already; NLP is a very hot subject. So you have two postdocs, seven research assistants, eight PhD students, six visiting research students; you're like 20 to 30 people. You said you were packed over the summer with lots of people visiting. So what is everyone working on? Fundamental NLP is the main overarching subject? Yes, yes. The goal of my research team is to push forward fundamental NLP technologies, and most of the students work on fundamental NLP problems, such as parsing, and also information extraction problems. So what is information extraction? In short, it is to extract structured information from unstructured text. For example, you want to extract what entities are mentioned in a text and what the relations between these entities are, such as a person can work for an organization, an organization can be located in a location, etc. And you want to extract events, for example, the CEO of a company has changed, and you also want to extract sentiment from text, such as these people are positive about this company, or these people are complaining about that brand. This information is highly useful for further applications, such as stock market prediction. So the goal of my research is mostly focused on how we can fundamentally solve NLP problems, to make algorithms intelligent in understanding human languages and in generating human
languages across domains, and this is currently still an unsolved problem, so we're devoting a lot of research attention to those problems. Okay, so within fundamental NLP we have an ability to parse language in general, and there are a couple of things. Let's start off this way and then we'll see where we go from here. A very classical example is text, in terms of books or articles, sentences, posts on social platforms, all this type of stuff. So within a book, let's say Harry Potter, you would want to know, okay, well, you have to identify that Harry Potter is a character first in the book. Yeah, you have to know it's a noun first, a named entity, for example. Named entity, yeah. And then you have to identify the other named entities. How do you train a parser to find named entities, and how do you make a knowledge graph, right, of how the named entities have relationships with each other, and also their sentiment, like if they're happy or sad, or if they're located in a specific area? How do you start doing that? Yeah, so you've given one example, which is the analysis of novels. This is something we're very interested in. This is what I should call a very fundamental research problem, because it doesn't have an immediate downstream application, so we've worked on this sort of thing for many, many years. Novels are a relatively low-resource domain, in the sense that there are not a lot of human labels in this domain, so correctly extracting named entities and relations from novels is a domain adaptation problem, or a few-shot learning problem, because you don't have training examples. What we could do is train a model on the news domain, where there are human annotations of entities and their relations, and try to adapt that machine learning model to a novel. So this is
called a domain adaptation problem, so we can use domain adaptation technology to solve it. Alternatively, a novel is a relatively closed world, because the character names are relatively stable from the beginning to the end, so we can use statistical technology to extract what occurs stably across the whole novel, and we can sort of get an idea of which are likely personal names, etc. That's completely unsupervised; that's another technology. What's more fundamental, we're trying to train algorithms that can learn and that can generalize from a more cognitive point of view, because when we humans learn, we learn from a few examples in the textbook and we can generalize. We are very good at summarizing the key concepts and generalizing, but so far machine learning algorithms cannot do it. They're data hungry; they must be trained end to end, so basically they recite the patterns from the training data and try to mechanically perform prediction on new data. So we try a more cognitively driven approach to enable machine learning algorithms to be more adaptive and more robust across different text genres. Yeah, there are a lot of research questions in this novel analysis. I love this one; novel analysis is really interesting. You were actually teaching me about this bigger push for Chinese internet novels, which I found very interesting. Most people in the United States aren't so familiar with that. The United States is very much about buying a hard copy book, or getting an Audible audio copy of the book, or even a digital copy on the Kindle or whatever. But this push for internet novels in China is really interesting, and in how you guys can advance your NLP parsers on the big internet corpus, with people releasing chapters, and people kind of waking up and waiting for the specific chapter to be released. Yeah, it's pretty cool. Yeah, so the
Chinese novel market is a little bit different, including the music market. A lot of Chinese novel writers go to internet publishers, which are not so-called publishers; they're actually websites where you can just upload your novels day by day without getting paid, and then all these free internet novels are accessed by readers across the country. How it works is that when a lot of people pay attention to one novel, the clicks will increase, and the website will pay the author by the number of clicks in the end. This is how the market runs in Chinese internet novels, and because of this there are a lot of writers whom the publishers wouldn't even pay attention to, because their writing is so bad to the editor, but their novels go online, they attract reader attention, and they improve their writing in the end. The way it works is, I think, quite unique in China, and recently Chinese internet novels have caught attention from abroad as well, for example in Southeast Asia and in North America. People have started to pick up Chinese internet novels, which talk about maybe an artificial world, such as the fairy world, or the world of Chinese martial arts, or the world of business, the world of the military; people imagine a lot of new worlds from there. So people wake up and wait for the new release of the next episode, etc. Luckily, we recently got into a collaboration with a startup company that tries to do machine translation of Chinese novels into English, so our upstream work on novel analysis has now been funded into some more systematic research. Yeah, I like how you're teaching me about this: sometimes research is actually really cool because you do research for the sake of research, and not necessarily for the sake of getting some sort of reward from the market, and then what's cool is that you can end up getting a
really great partnership down the line, when maybe some of the market catches up to where your research is at, and then they'll be like, okay, hey, we're finally ready to use some of this great research that you've done, can we pay you, can we collaborate, do you want to get in on this? Yeah, that's great. Yeah, I have another story, which is about my work on stock market prediction. That started in 2013, when I had just become an assistant professor. Basically, that line of research has attracted a lot of attention recently, because more and more people have started to think about applying cutting-edge NLP technologies to stock market prediction, but when I started working on that, people in the financial research literature were still very crude in how they used NLP technology. Basically, a lot of their papers were about counting words: you just count the number of negative words and you make market predictions. But because we could parse a sentence into structures, we could get structured events from news, so we could predict the market better. What drove me to work on that? There were a lot of factors, and one of them was interesting; I can tell you a story. I invited a colleague from Spain to give a talk in Singapore, and he talked about parsing, and after his excellent talk one of my colleagues stood up and asked a question: what is this useful for? Because this is really upstream, I was a little bit embarrassed. I thought, okay, I don't know, how can my invited speaker answer the question? But he was really calm, saying, look, this could be used for stock prediction, because you can know the events, etc. So that was part of the thing: we said, why don't we just start doing it, and we started doing some research on that. When we did that, we were basically trying to find a very interesting application of parsing, but we didn't think about whether it could make a lot of money or
not. Yeah. So is a decent amount of what's happening with stock market prediction based on sentiment analysis? Yes, there are a lot of people working on that. Interesting. But we were also very interested in the news, right, because the news talks about events, and events are happening around us every day. What are the correlations between the events and the market? It's a very fundamental question. Yeah, like if there's a certain issue happening with a crop in a region, then that news can be quickly parsed and given to the people trading that commodity. Exactly, exactly. So as I collaborated more and more with traders and people in that field, I realized that the information asymmetry between the buyers and the sellers in the capital market is a very big issue, especially in developing financial markets such as China, where information is not quickly conveyed between the different parties, the different players in the field. In this aspect, natural language processing can facilitate the development of the capital market in these developing markets. Yeah. Then let's get to other examples. We have this great example of parsing novels, and you can also parse social media for sentiment, and news, as we were just talking about, and you can parse reports and all different kinds of articles, et cetera. Then there's this whole other beast, which is audio. With written text, you can use something like optical character recognition: okay, that's the word "water," this is the word "planet"; you can tell those are uniquely different. But "water" spoken aloud comes out as an actual wave that then has to be processed, and you have to do digital signal processing every single time to work out what that is.
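The crude word-counting baseline Yue described a moment ago, counting negative words in financial text to guess market direction, might look roughly like this sketch. The lexicon, threshold, and headlines are all invented for illustration; this is not anyone's real trading system.

```python
# Toy sketch of the "count the negative words" baseline -- NOT a real
# trading model. The lexicon and threshold below are assumptions.
NEGATIVE_WORDS = {"falls", "drops", "loss", "lawsuit", "resigns", "crisis"}

def negative_word_score(headline: str) -> int:
    """Count negative-lexicon words appearing in a headline."""
    return sum(
        1
        for token in headline.lower().split()
        if token.strip(".,!?") in NEGATIVE_WORDS
    )

def naive_signal(headlines: list[str], threshold: int = 2) -> str:
    """Emit a crude 'sell' signal when total negativity crosses a threshold."""
    total = sum(negative_word_score(h) for h in headlines)
    return "sell" if total >= threshold else "hold"
```

For example, `naive_signal(["CEO resigns amid crisis", "Stock falls on weak earnings"])` returns `"sell"`. The structured-event approach Yue contrasts this with would instead parse out who did what to whom, rather than just tallying words.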
So how are you guys involved in audio? Not really; we work on text. But audio is very relevant to natural language processing. There's a field called speech recognition, and there's a field called text-to-speech; I guess those fields are necessary before you can do natural language understanding of audio. Over the years, deep learning has allowed those fields to develop very quickly. For example, text-to-speech technology has developed a lot, and people can mimic real people's voices to the extent that one can hardly distinguish whether audio is synthesized or human. Yeah, the issue of deepfakes. Yeah. Interesting that you guys are not as focused on that. This is one of the tough things about being a principal investigator: you have to figure out where to invest your life energy, your resources, your time, inspiring the other researchers. Yeah, the more you work in a field, the more you feel that the energy and the time of a person, of a researcher, is really limited, and as long as you can bring a breakthrough in one small field, that's already quite a big achievement, right? I remember a lot of scientists also talked to me about this, saying that the energy of a person, the lifespan of a person, is really limited, and it should be devoted to one small thing. Yeah. So then let's dive into what exactly it looks like to be building out a catalog or a corpus of your NLP technologies, in comparison to some of the other big giants like Baidu and Google. All these companies also have their own fundamental NLP they're working on, but they have to work to KPIs, key performance indicators, and sometimes do things that are going to make money and get rewarded by the
market, and focus on that, but sometimes they can also do research. So what differentiates what you care about, the fundamental NLP, from what the other big corporations are doing? Yeah, I think now the differentiation can be rather small, because, as you mentioned, a lot of companies give freedom to their researchers to explore whatever they like. But I feel that in academia we get more freedom to explore what's interesting. For example, novel analysis might not directly apply at Baidu or Google, but we can just spend a lot of time working on it. So this could be one difference. And here we also get a chance to collaborate with neuroscientists. For example, there are people working on fruit fly brains, there are people working on mice brains, there are also people working on human brains, and we have a colleague working on brain-computer interfaces. Talking to those people gives us a lot of chances to study natural language understanding from a cognitive point of view, which I think is also a unique thing about academic research. Yeah, so you guys have that multidisciplinary community here, where you can work with, like, Mohammed or with Isu. Exactly; those are exactly the people I talk to. Yeah, we love them as well; those are some of our favorites. Isu is definitely an expert in visualizing all the neurons and all the connections in the fruit fly brain. Yeah, and actually that work has a lot to do with the neural mechanisms of social cognition, and in many ways NLP is about social cognition; it's very interesting that one writer can write a chapter and it can be viewed by millions of people. Exactly, exactly. The more I work in this field, the more deeply I understand it, because the first time you work on syntax, for example, you see syntax as abstract trees, but the
more you work on it, you realize, okay, syntax is not detached from semantics; sometimes you understand the meaning before you understand the structure. And the more you work on it, the more you realize that syntax is not detached from cognition either; sometimes you really need common sense, or external knowledge, world knowledge, in order to understand the syntax. Wow. The more you work on it, the more you feel that everything is correlated, correlated through fundamental understanding. Teach us with a relatable example of understanding the meaning before understanding the syntax. All right, let me give an example on coreference. Coreference means that in a sentence you have different pronouns and nouns mentioning, talking about, the same real-world entity, but you don't know which are connected with each other. Here's an example. Say, "The dog cannot cross the street because it was too...", and I can choose another word to finish the sentence, right? Here there's one entity, the dog; there's one entity, the street; and there's one pronoun, "it." When I say the last word is "timid," then "it" refers to the dog, right? But when I say the last word is "wide," or "busy," it refers to the road. And if I say the last word is "dark," then "it" refers to the environment. Oh, right, and the environment is so abstract that it didn't even have its own word. Exactly, it's an implicit mention. Whoa. So you see, when you try to resolve the coreference, or resolve the anaphora, you have to really understand the meaning, and you should also have world knowledge about the scenario of crossing the road. Whoa. So how would you ever teach fundamental NLP to have the implicit understanding that this is happening in an environment? Yes, this is part of what we're currently working on. We're trying to evaluate how well the current models are
equipped with common-sense knowledge, and how to impart common-sense knowledge into NLP models. That's one of the directions our team members are working on. How the heck do you impart common-sense knowledge? I think there are a lot of different technologies. For example, you can directly apply explicit knowledge graphs to a natural language understanding system, or you can train a system over a large amount of unannotated, unlabeled text and guide it to collect common-sense knowledge from those texts, just as we collect common-sense knowledge from textbooks. Another interesting thing is to connect common-sense knowledge with cognition, because, as you mentioned, some researchers are working on how we perceive the world. We have an inbuilt ability to understand three-dimensional space and time, and this is something existing NLP systems are not quite focused on. There was a really good interview with Gary Marcus, who just did Rebooting AI, about how you can start building computers that can perceive with space, time, and causality. What would it be like if an NLP system found itself in space, time, and causality? Yes, yes, existence. These must exist in a cognition system, and they must have a representation in speech. For example, if I say "this person is really big," we're using space ideas to convey an abstract concept, and this is prevalent in languages. Whoa. Yeah, so language perception is quite correlated with our space and time perception. And causality is another thing in natural language understanding; there's a task in information extraction called causality extraction. Okay. And it also plays an important role. Give us an example of causality extraction. One example: a famous CEO steps down from a company, and the stock price falls. The text might not explicitly mention that there's a causal relation, but you want to extract this fact in order to help you better trade in the market. Yeah.
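As a toy illustration of causality extraction, the cue-phrase approach below pulls an explicit (cause, effect) pair out of a sentence. The regex patterns here are assumptions for the sketch, and, as Yue notes, a real system also has to recover causal links that are never stated explicitly, typically by working over parsed event structures rather than surface patterns.

```python
import re

# Cue-phrase sketch of causality extraction. Real systems work over
# structured events and implicit links; these patterns are illustrative
# assumptions only.
CAUSAL_PATTERNS = [
    re.compile(r"(?P<effect>.+?)\s+because\s+(?P<cause>.+)", re.IGNORECASE),
    re.compile(
        r"(?P<cause>.+?)\s+(?:leads? to|led to|caused)\s+(?P<effect>.+)",
        re.IGNORECASE,
    ),
]

def extract_causality(sentence: str):
    """Return a (cause, effect) pair if an explicit causal cue matches."""
    text = sentence.strip().rstrip(".")
    for pattern in CAUSAL_PATTERNS:
        match = pattern.fullmatch(text)
        if match:
            return match.group("cause").strip(), match.group("effect").strip()
    return None
```

So `extract_causality("The stock price fell because the CEO stepped down.")` yields `("the CEO stepped down", "The stock price fell")`, while the CEO example in the transcript, where the causal link is implicit, would return `None` under this sketch.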
What do these trees look like? How many different NLP systems do you have? Do you just pick an NLP system for a specific problem that you want to solve and deploy that one, or do these all work together and get deployed for the specific need you have? And what are they, trees of programming data, algorithms? What do those trees look like? That's a very good question; I appreciate your line of thought. Currently, every NLP task is solved alone: you have some training data, you train an algorithm to learn from the data, and the algorithm can perform this task. It's called end-to-end learning: input in, output out, right? Five years ago I was an advocate for end-to-end learning as well, because I thought it was really great. It saves us from feature engineering; it saves us a lot of the effort of working out how to solve a particular task. We just care about input and output, and the deep learning algorithm can discover the correlation. But now, think about it: every task requires a set of labeled data to train on, right? And there are tens and hundreds of tasks that a person can need. This is definitely not the ultimate way an ideal NLP system should work. Think about human cognition again: we learn syntax, we learn spelling, we learn semantics, and in the end all that knowledge comes into one system, and we can perform even bigger tasks by synthesizing all these existing skills together. I'm personally very interested in joint modeling as well. From the very beginning of my research, I worked on, for example, joint word segmentation and part-of-speech tagging, because I believe knowledge from these two tasks can be mutually beneficial and can help a model better perform both tasks. I still hold this point of view now, but from a deeper understanding.
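One simple way to see the joint word segmentation and POS-tagging idea is a joint label scheme: a single per-character tag encodes both tasks, so one model predicts both at once and each task's evidence can help the other. This is only a decoding sketch with a constructed toy example, not the actual model described here.

```python
# Decode joint 'B-POS' / 'I-POS' character tags into (word, POS) pairs.
# 'B-' marks the first character of a word, 'I-' a continuation, and the
# POS suffix tags the whole word, so one tag sequence answers both the
# segmentation and the tagging question at once.
def decode_joint(chars, joint_tags):
    words = []
    for ch, tag in zip(chars, joint_tags):
        boundary, pos = tag.split("-")
        if boundary == "B":        # start a new word
            words.append([ch, pos])
        else:                      # extend the current word
            words[-1][0] += ch
    return [tuple(w) for w in words]
```

For instance, `decode_joint(list("我爱北京"), ["B-PN", "B-VV", "B-NR", "I-NR"])` gives `[("我", "PN"), ("爱", "VV"), ("北京", "NR")]`: the segmentation and the POS tags fall out of the same label sequence.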
I believe that if people can do natural language understanding and all the related tasks in a more cognitively driven way, then maybe one model can learn all these tasks. Yes, yes. Right now end-to-end systems cannot do it, because information from one task may become noise for another task. Yes. If we can mimic human understanding, we can automatically figure out which sources of information are correlated and which sources of information are conflicting. That would make the algorithm truly intelligent. Like, right now I'm not really using my somatosensory system or my olfactory system; I'm not using my touch and my smell that much right now. I'm really just using my visual and language centers. So if your NLP system had everything you've ever made embedded within it, then once you feed it an input, it would recognize immediately that out of, let's say, 47 different systems it only needs two of the 47; it shuts down the other 45, so you don't use any compute on those, and then you only use those two on that problem and produce the output. Exactly, exactly. And in addition, you can borrow whatever can be borrowed from the other senses to better convey what you can convey in these two senses. For example, you can say, well, the task is so beautiful; when you describe a task, you're using your visual senses to describe something that is elegantly laid out. So basically you can better integrate knowledge from different senses, making them boost each other's performance rather than becoming noise. Yes, yes. Oh, I like that a lot. Okay, so what would you say is the most popular NLP algorithm you guys have developed? There are quite a few. The underlying algorithm of ZPar, which is the transition-based model guided by learning, is adopted by a lot of research. Our attempt to use cutting-edge NLP technology on trading is also
receiving increasing research attention. Another thing we recently developed is a recurrent neural network structure to represent a graph. Basically, a lot of things in languages are graphs. For example, trees, like syntactic trees, or coreference links make a graph out of the text structure, and how you represent the graph is very important to how you can make use of the knowledge from the text. So over the past two years or so we developed some graph representation algorithms, called graph recurrent neural networks, which have attracted research attention as well. All in all, I believe all these efforts toward robust natural language understanding are appreciated by other researchers in the community, and we are working together to evolve the field. Yeah. And for people who maybe want to input some of their own English or Chinese text, see what syntax and semantics you guys pull from it, and gain some understanding of how the knowledge tree is actually created: what would that look like? Would it look like me entering a chapter, or maybe some of the notes I've written on a specific topic? And what would immediately happen at that point? Would you literally just start scanning from the top left to the right, line by line? Yes, start scanning, start organizing. Take us through that. All right, so my algorithm does the left-to-right scanning, but not all the algorithms in the market do that. For example, opposite to transition-based algorithms, there are graph-based algorithms, which will just read the whole text, or whatever you enter, as a chunk, and start analyzing the chunk as a whole. But my algorithm reads it from left to right and incrementally starts to build the structure. Okay. It uses a lot of state-transition
actions, such as shift and reduce, and I believe this kind of understanding is closer to psycholinguistic processes of human understanding. Yeah, okay. So then, let's say the first name is Yue, and we label Yue with a variable, name A, and then the next name is Alan; it appears somewhere down the line, and you label that as name B. Then you maybe have a counter on the number of times each name has appeared, and you maybe start showing a relationship between those names, how often they appear in the book, some sort of thing like this? Yes, things like this. And the latest algorithm we developed contains one understanding element and also one look-ahead element, which tries to project forward: for example, you've read "Yue" and "Alan," and probably you're expecting a verb or something. Yeah. So we also have this look-ahead mechanism. A look-ahead mechanism, yeah. I guess that's also really related to neuroscience; humans are always doing future prediction. I think so, I think so. That's inspired by human language processing: as soon as I say a few words, you have an expectation of what I'll say next. Yeah. Where else do you see you and your team doing some of this biomimicry for building computer systems? We are trying to go further along the line of transition-based processing; we are trying to avoid end-to-end systems and to mimic how the human brain works in understanding a language. For example, we're trying to find structures in the neural system, and we try to make use of those structures to better memorize, to better generalize, et cetera. That's also why we work on graph neural networks. So you're starting to immediately make a knowledge graph as I enter the English or the Chinese writing? Yes.
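The shift/reduce state transitions Yue mentions can be sketched as a minimal arc-standard parser. Here a hand-scripted action sequence stands in for the learned model that, in a system like ZPar, would score and choose the actions as it scans left to right.

```python
# Minimal arc-standard transition system: a stack, a buffer of word
# indices, and three actions that incrementally build dependency arcs.
def transition_parse(words, actions):
    buffer = list(range(len(words)))  # word indices, left to right
    stack, arcs = [], []              # arcs are (head, dependent) pairs
    for action in actions:
        if action == "SHIFT":         # move the next word onto the stack
            stack.append(buffer.pop(0))
        elif action == "LEFT_ARC":    # second-on-stack depends on top
            dependent = stack.pop(-2)
            arcs.append((stack[-1], dependent))
        elif action == "RIGHT_ARC":   # top depends on second-on-stack
            dependent = stack.pop()
            arcs.append((stack[-1], dependent))
    return arcs
```

For `["the", "dog", "barks"]`, the action sequence `["SHIFT", "SHIFT", "LEFT_ARC", "SHIFT", "LEFT_ARC"]` produces the arcs `[(1, 0), (2, 1)]`: "dog" governs "the," and "barks" governs "dog," built incrementally as the words are read.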
In the sense that it's a knowledge graph that's more connectionist rather than symbolic, an associative memory. Yeah. So you're starting to build out an associative memory web as you parse the words? Yes, that's an attempt, because the thing about doing research is that you always try, but you don't know where you'll end up. Our goal is to make cognition-driven NLP models that can really generalize better, that are more robust across different domains, that are more accurate than the current systems are. Ultimately, I really want to make natural language understanding and natural language generation systems that can help human beings. For example, a personal assistant that can talk to me very freely, so that it saves my effort in human-computer interfaces; automated driving cars as well, right; machine translation; stock trading systems that can read over a lot of news and reports in the market and then trade cleverly; and also novel-reading algorithms. For example, I could ask it: I like this novel a lot; can you find me a novel that's really similar to it? Or a novel-reading system that reads a lot of things and tells me whatever I need. That could also boost literature research, computational literature study, as well. If algorithms can intelligently understand languages, they can help us do a lot of things. Making a computer system that does natural language processing by mimicking the way we make associative webs is really interesting. There is, in a sense, a little Alan avatar right now that exists on a Google server, and as soon as I go to google.com or baidu.com, it wakes up, like, ah, what are they about to search for? Yeah. You don't have to search, actually; you only need to ask
questions. As soon as you make a query, once you start entering, you don't have to enter keywords anymore; maybe you can just talk to a search engine: hey, I have this question. Yeah, exactly, all by voice. And then by thought. By thought is the next step. It will synthesize every piece of information it can get and give you an answer that's tailored to your need, to your exact need, to your entire life history, which it's been analyzing, and it knows you. Do you ever worry about the amount of information that Tencent or Baidu or Alibaba or Amazon or Apple or Facebook has on you? We were talking about asymmetry earlier; do you ever worry about the asymmetry that we don't know ourselves as well as this little avatar of Yue in the cloud knows who you are and what you've looked at before? Yeah, that's a profound question. If you take one step further, you can think about what AI will do to us, right? Especially when you talk about reinforcement learning: they can explore and learn, and in order to achieve their goal they can explore different ways to do things, and maybe ultimately AI, on obtaining every bit of information about human beings, might start to think about a goal like, what are the correlations between AI and human beings? So that's a more profound question, and I think it's also related to the question you asked at the very beginning: is technology the way we should go? In the end, are we seeking our own doom through technology, or are we making our lives really better through technology? That's a very profound question. What are your thoughts about that? On this particular question, I think it's good and bad. Good because, well, it seems that we don't have a choice, right? We have to move on, because we're so entrenched in this that we cannot move back anymore. Everyone is educated to be more technological to survive in the society, so we're moving along, and we
are making a lot of excitement in technology development. But this contradicts my philosophy, which is that we know so little about the world, and we shouldn't advance so much in knowledge, maybe making our lives more complicated; we should make our lives simpler. This is my philosophy, and it's also related to Dao. Do you know Dao? Of course, yes. I believe it makes a lot of sense. I read Laozi, I read the Dao, and I think a lot of what he says makes sense. Oh yeah. You remember, in the very beginning, is technology part of the Way? No, not at all; the Way is against technology. You remember, in the Dao he claims that you should try not to be too clever; being too clever is not a good thing. And why, at the very beginning, does he say that the Dao, the real thing, the truth, the wisdom, is not tellable? Why is it not tellable? Because what can be told is within our cognition system, and the reality, the truth, is beyond our cognition system. I used to talk to mathematicians, and I learned that one axiom in math is an equivalence axiom, which says that one is equivalent to one, and one is not equivalent to zero; I talk in binary terms because I'm a computer scientist. But in Laozi, one is "you," something, and zero is nothing, and he says the truth cannot be told, because one can be zero. This completely contradicts our mathematical system. What he claims is that the real truth is not explainable by math. And maybe this sort of coincides with quantum mechanics, because in our math system a cannot at the same time be equal to b and unequal to b, but in quantum mechanics particles can be here and there at the same time, and two particles can be separated very far from each other but still be tied to each other. These kinds of things are not directly explainable by our cognition system. I also tend to believe that, as I mentioned to you a bit earlier, the whole math
system is a simple rediscovery of our cognitive system. Because we live in this three-dimensional space and time, we human beings evolved over long, long years to get adapted to this system, to survive in this system. So we discover algebra, we discover geometry, all these kinds of things which make sense to us, and we create the math system as a summary, a reflection, of our own cognitive system. But as Laozi points out, the Dao is beyond this system: one can be zero; they can be the same thing. That's why he claims that we should do as little as possible, which is the opposite of what we are facing right now. We try to do as much as possible, but Laozi says we should do as little as possible. For me, I want to travel around the world to know the world, to know more, but Laozi said the more you travel, the less you know. For the real knowledge, you can sit on the sofa, close your eyes, be quiet, quiet down and meditate, and at that stage your thoughts become zero, which becomes infinity, and you immediately know everything around you. I remember you also interviewed a professor, Chui Wecheng, who also said that in Buddhism they do this: you meditate and you begin to know things beyond your three-dimensional space. Right. So this kind of practice is what people who practice Daoism chase after. I think it makes a lot of sense, but I'm very far from that kind of wisdom, because I'm also very excited to work on what I'm working on. Likewise, yeah. So then maybe there is some way to combine one's ability to dive deep into that zero with the ability to enjoy the beautiful planet's cultures and treasures and technologies. I'm not sure. The creativity and meanings, yes, all these things. All these things, when you feel that this is exciting, when you feel that this is elegant: this is far from chaos. But Laozi says chaos is
something. One paragraph says that when Dao becomes things, it makes things out of chaos: out of chaos you start to have shape. And this very much coincides with quantum mechanics, because when you go down to the particle, no shape is there, right? It's sort of chaotic. One of my friends referenced this as crickets trying to understand human civilization, and we are, in a sense, the crickets trying to understand the Way. And if the Way is chaos, it also coincides with the entropy law. But do you think it's impossible for us to understand the Way? I don't think anything's impossible; I think it's definitely possible, because I know there are people who know more than us normal people: those people who practice Dao. They try to be recluses, they try to get away from the world, they try to hide themselves, and some of them also reveal something, right? So I definitely know there are people who can know more than I can know, and there are people who can do more than I can do. These kinds of things exist. But I don't think you can at the same time be very conscious and be very chaotic. As I said, I have another story I want to share with you, which is about the entropy law. You know, if you have a system, the system will obey the entropy law: as time goes on... what is time? I don't know, but according to some physicists, time is just one coordinate, and the unique thing about time is that time flows by obeying the entropy law. Everything will become more and more chaotic; from simplicity it evolves complexity, from order to disorder. Which in a sense could be... what simplicity, what complexity? Maybe, I mean, if you assume that the universe is a system, then the universe will go to chaos in the end. Then what
about intelligence? Intelligence is definitely structured; it's not chaos. That's right, so I think intelligence is only an anomaly in the development of the whole universe. If the universe is a river, there can be some swirls in the river, and we're just heading along a little bit. Could it be a purposeful anomaly? So that order evolves into disorder, simplicity evolves into complexity, which then gives a human brain and a human civilization at some point, which then becomes sophisticated enough to understand the initial source code of the simplicity it came from, and then does a full circle, makes the simplicity, and starts the next cycle again; so there is no linear time, but a cyclical time, and we're just embedded in the middle. I think that's possible. It's a fun one, but I don't know, because anything is a possibility to those who do not know, right? So I don't know, but I tend to think it makes sense that everything will go back to the original point and do the cyclic thing. It makes a lot of sense to me. And when you really tap more and more deeply into that feeling, the political, economic, social machinery around you kind of just fades away. Yeah, exactly. And you're more able to just be. Exactly, yeah. If you're really into philosophy, especially if you're really into the philosophy of Daoism, then nothing is important, right? Yeah, that's the ultimate truth. But you keep building better NLP. Just like I can't help thinking about the philosophy, I can't help working on what I work on. Yes, doing and not doing: how do you figure out when to do and when to not do? It's a very difficult question; I think it's a dilemma, right? It is, yeah. Every time, it's a bifurcation of your life trajectory. Yeah: do I do four hours of NLP work right now, or do I
not? Do I play with my child? Do I hike out to the actual West Lake? Or do you just sit in the room and meditate? Yeah, one of the hardest philosophical questions. Being someone who really understands parsing, here we are in this reality: how do you parse the reality for the most essential things? I don't quite get your question. Do you mean how do I understand the reality? How do you identify the essentials? How do you parse for the essentials of reality? Like this idea of the Way; it's such an essential, right? What's essential depends on what you call reality. I think both the Daoist practitioners and some of the monks say that whatever you can see and whatever you can hear is not real, right? Those things you cannot see, those you cannot hear with your senses, are real. I remember one of my friends, who is also a professor, mentioned to me what he read in those books: when the monks record whatever they perceive, they say, oh, this is a plant, but in reality it's not a plant; I name it a plant in order to be able to communicate with you. I think those people really have a better understanding of the reality, but the reality cannot be told, as Laozi says, because our cognitive system and our communication system are based on one being one and zero being zero. So if I'm already assigning a symbol, a word like "plant," to that thing, it's not actually that; humans just use that word, in English, Chinese, Arabic, Hindi, whatever language, to describe it. So when I ask you, hey, will you move that plant, or will you water that plant, you know what I would like you to do. But as soon as we start doing that, words and symbols become artificial additions to the Way, and they're actually, in a sense, further complicating the Way; and the Way is actually inaction.
Yeah. So what would be an ideal NLP tool maybe 20 or 30 years down the line? Give it a good number of decades: what would be the ideal NLP tool, the absolute best cutting-edge thing, that could solve all of the cool visionary things that you want to see happen? Well, I guess it's easy to answer, because a real NLP system is an NLP system that can really understand what you say. Of course this is challenging, because sometimes what you say requires a lot of background knowledge, but I'm assuming that if an NLP system can really be equipped with the cognitive system that humans are equipped with, at least it can understand the basic science, the basic humanities, the basic background knowledge that people have, so that it can help people in their communication and maybe even do more exciting, more analytical things. A hundred percent understanding of what you are saying, and helping be a really good catalyst for your goals. Yeah, okay. Do you think this is a simulation? I don't know. I have another hypothesis of the world, which is that maybe it is a simulation, because our gene structures look like computer code, right? Only a small fraction of the gene structure is translated into protein in our body; there's still a large proportion which is uninterpretable to us. So I don't know whether we are code and everything is a simulation; I'm not sure. What do you think is the most beautiful thing in the world? The most beautiful thing... well, I love nature, and I love the humanities. I think there are a lot of beautiful things in the world which are very much in order. For example, if you look at art, there are so many different cultures in the world, and each culture has its own unique art that inspires you, makes you feel excited and appreciate the beauty. And nature is also very beautiful. I like traveling a lot; I go to wild places, like in the
mountains, on the cliffs, and I appreciate the beauty of nature as well. I believe that beauty is appreciated by the whole of humanity; we have a common sense of what's beautiful, and maybe what is structured is beautiful, I don't know. But it seems to be so deeply rooted in our cognitive system that we all appreciate this beauty and love. Great. Thank you so much for coming on our show; this has been such a pleasure, such an honor. Nice talking to you. Thank you, thank you. Thanks, everyone, for tuning in; we greatly appreciate it. We'd love to hear your thoughts on the episode in the comments below; let us know what you're thinking. Have more conversations with your friends, families, co-workers, and people online about natural language processing, about fundamental NLP, about parsing for syntax and parsing for semantics, about making more intelligent NLP algorithms, about the meaning of life, about Daoism, about philosophy as well. Check out the links in the bio below to Yue and the lab, and reach out if you'd like to get involved and collaborate. Also support the artists, the entrepreneurs, the leaders around the world that you believe in. Support Simulation; our links are below, so we can continue doing cool things like coming on site to great places like Westlake to interview such brilliant minds. Go and build the future, everyone; manifest your dreams into the world. We love you very much, thank you for tuning in, and we will see you soon. All right, peace. Okay, yes, good job, brother. Thank you. Good job.