Hi, my name is Christopher Michael Stewart, and I'm going to be presenting today on data science, human language technology, and natural language processing. Okay, so just a little bit about me. I did a PhD in French linguistics at the University of Illinois, then worked for a few years as an assistant professor of French linguistics in the modern languages department at the University of Texas at Arlington. I then left that job and worked as a voice engineer in the text-to-speech research and development division at Nuance, which is now called Cerence, in Belgium. While I was there I developed the Siri voices, took on a major technical and customer-facing role, and had a lot of good beer in Belgium. Then I left that job and held two jobs as a senior data scientist: first at Tata Consultancy Services in Arkansas, developing a machine learning pipeline for Walmart, and then at a tiny startup called NarrativeWave, doing time series modeling of Internet-of-Things data. I now work at Google as a computational linguist. My team works on detecting policy violations at scale and on best practices for crowdsourcing tasks, and I build automated data quality reports. Before I go much further, I'll mention that I am not speaking on behalf of Google here; I'm only talking about my own experience and background. Okay, so I've set myself up with the impossible task of covering data science, human language technology, and natural language processing in 45 minutes. Each one of these is an entire career's worth of information, and there's no way I could possibly do justice to all three of these fields in 45 minutes. So what I've aimed to do is hit the sweet spot between these three fields: what do they have in common? And I'm going to tell you that what I think they have in common is something that I'm here calling predicted probability.
These are all fields that are about modeling, statistical modeling in particular, machine learning, and trying to make some sort of inference about the future, using data to predict it. So what does that look like? On the slide I have "predicted probability" and then ML for machine learning or DL for deep learning, and really more deep learning than machine learning these days. Under that I put the short names for these fields, and I'm going to be using DS, HLT, and NLP, and I put some examples of what are called conditional probabilities. A probability is denoted by P, open parenthesis, the probability of some event: so the probability of a fair coin coming up heads is 0.5. We also condition probabilities on things. In data science, if you're working as a data scientist at, for instance, Tesla or somewhere like that, you might be interested in predicting the probability that an object that appears in front of a car is a human. It's predicted, and it's probabilistic: what is the probability this object is a human, given data coming from thousands of sensors, historical data, metadata, all sorts of stuff? In human language technology, if you're working on a text-to-speech synthesis team, you might have a whole bunch of segments, a whole bunch of cut-up vowels and consonants and things like that, and you're interested in predicting the probability that a segment is good for a text-to-speech synthesis context, given a language model, spectral specifications, duration, focus (narrow or broad), part of speech, etc.
In NLP you might be interested in predicting the probability that a tweet will contain hate speech, given metadata about the tweet, the actual content and language of the tweet, the time it was tweeted, the place it was tweeted, who retweeted it, how many retweets they got, all that sort of information. So this is the sort of world that you will live in, and I'm going to start this talk by talking about probability, predicted probability, and machine learning. This is probably very different from the other talks you've been to, because I'm going to start with the part you might not know very much about and work back, in the second half of the talk, to an actual language context. If you don't like that, I'm sorry, but it is the common thread among these three subject areas. Before I go too in-depth here, I want to emphasize: if you have never taken statistics, or have no idea what machine learning is, or are afraid of equations, or whatever the case may be, don't worry, just try to absorb the ideas here. For instance, this conditional probability thing: imagine asking what the probability is that it's going to rain this coming Friday. You might have some sort of naive probability, but if I tell you that it's Monday, you might make a very different prediction than if I tell you it's Thursday; closer to the time, we have a better idea of what's going to happen. So try to engage with the ideas, even if they seem foreign or hard to get, and don't worry if you don't get all of the particulars. Okay. Think to yourself: have you ever taken a class in statistics? I can only see a few people; raise your hand if you've had a statistics course.
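The rain-on-Friday idea can be sketched in a few lines of code. This is a minimal illustration with made-up counts, not real weather data:

```python
# Minimal sketch of conditional probability, P(A | B).
# All of the counts below are made up for illustration.

def conditional_probability(count_a_and_b, count_b):
    """P(A | B) = count(A and B) / count(B)."""
    return count_a_and_b / count_b

# Imagine records for 100 past Fridays; it rained on 40 of them.
p_rain = 40 / 100  # the "naive" unconditional estimate: 0.4

# Now condition on extra evidence: of the 25 Fridays where the
# Thursday before was overcast, it rained on 20.
p_rain_given_evidence = conditional_probability(20, 25)
print(p_rain, p_rain_given_evidence)  # 0.4 0.8
```

The extra evidence changes the estimate, which is exactly what conditioning means: the "given" part narrows the world down to the cases where the evidence holds.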
Okay, if you haven't had a statistics course, no worries, not a problem. The interesting part for our purposes about statistical modeling is this: statistical modeling is when you take a set of data and try to build a model to do something, and, per the famous quote, all models are wrong but some are useful. The point is to try to figure something out with this model. In the social sciences we build models: we collect data (language data, in this case), we build a statistical, mathematical model of that data, and we assume that this model tells us something about what is going on in the data. So if you have a regression model, you will have parameters in that model, and they will tell you something: how predictive is someone's language background, say, of the language that they use? We assume that these models reflect some sort of underlying process, and because of that we're very interested in the parameters; the parameters are the interesting parts of the model. Does someone's age tell you more than their L2 about some sort of linguistic behavior? Because we're interested in parameters, we prefer simpler models in this statistical modeling culture. This is prevalent in causal research but not so much in engineering. I forgot to mention that I'm taking this from a paper by a statistician named Leo Breiman called "Statistical Modeling: The Two Cultures," and it's helpful for understanding how these two cultures differ. So that's data modeling: if you've taken introduction to statistics, you've talked about things like this, t-tests and ANOVAs and regressions and logistic regressions and all that sort of stuff.
There is a different culture of statistical modeling that Leo Breiman, in this paper, calls algorithmic modeling. In algorithmic modeling we do not assume that the model we build reflects some sort of underlying process; there's no assumption that what comes out of the model is about reality. We're not really interested in that. What we're interested in instead is prediction: we're interested in predicting what's going to happen, using the data that we have from the past. Because of this we don't really care about the model being simple, because we're not all that interested in what's going on inside the model; we're more interested in prediction. So we're not afraid of black-box models, models where we don't necessarily know what's going on in every tiny little part. There's typically lots of training data and lots of computational power. Now, if you've never heard of this algorithmic modeling world, with deep learning and machine learning and all that sort of stuff, but you have taken a statistics course, there is an intermediate between the two, a field called statistical learning. And there is a book by Hastie and Tibshirani called An Introduction to Statistical Learning that I really recommend, if you've never read it. Okay, so let's talk a little bit about machine learning. Machine learning: great buzzword, what does it mean? In traditional programming you take input, you write a program, and you get results. For instance, a popular coding problem is: write a function where you give the function a number as input; if the number is divisible by five you output "fizz," and if the number is divisible by seven you output "buzz," or something like that. So you have input (an integer), a program (this thing that asks, is this number divisible by five, is this number divisible by seven), and it outputs something.
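As a concrete version of that input → program → results picture, here is the fizz/buzz variant just described (divisible by five → "fizz," divisible by seven → "buzz") as a hand-written program. No learning involved, just rules:

```python
def fizz_buzz(n):
    """Traditional programming: hand-written rules map input to output.
    Using the talk's variant: divisible by 5 -> 'fizz', by 7 -> 'buzz'."""
    if n % 5 == 0 and n % 7 == 0:
        return "fizzbuzz"
    if n % 5 == 0:
        return "fizz"
    if n % 7 == 0:
        return "buzz"
    return str(n)

print([fizz_buzz(n) for n in (10, 14, 35, 11)])
# ['fizz', 'buzz', 'fizzbuzz', '11']
```

The contrast with machine learning, below, is that here a human wrote the rules; the machine learning version is handed inputs and results and has to come up with the rules itself.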
Machine learning is a little bit different, because we take input and results, put them into a machine learning model, and what comes out of it is something like a program. It's important to know that this has nothing to do with language per se; this is basic binary classification, so you could be looking to classify anything. Here I have this funny sort of toy example: is this a chihuahua or a muffin? But you could be looking to predict whether it's going to rain or not, whether this person is likely to click on this link or not, any sort of binary classification. For the purposes of this little demonstration we're going to talk about this funny toy problem in visual binary classification: is this picture a chihuahua or a muffin? I took this picture from an article where you can see that chihuahuas and muffins are actually not always that easy to disambiguate, which is kind of scary. So we want to build a classifier. What do we do? We need labeled data. We need image one, image two, image three, and image four, where we indicate whether each thing is actually a dog or a muffin. We put that into a model, and the program that comes out of it we'll just call, for now, "dog or muffin." So we build a classifier and then we use it. We start with what's called training data, these four instances, images one through four. Then we feed a new, unseen image into the classifier and have it predict: is this a dog or a muffin? In one case it could say, I'm 95% sure that this is a dog. Then we look at the expected results. If we put in another image, image six, it might say, look, I'm 55% sure (so not really all that sure) that this is a dog. We look at the expected results.
Oh, it's a muffin. So the model wasn't very confident that it was a dog, and that's good, because it wasn't a dog, it was a muffin. We can do this over and over again: image seven, 64% chance it's a dog; image eight, 92% chance it's a muffin. We're wrong with image seven and right with image eight. And you can go through this over and over and over again. So this is the basic intuition behind how this works, and you can see that what we're interested in here is prediction. I don't really care what's going on inside "dog or muffin." I don't really care whether the model is looking for ears in an image, or how bright or dark the image is, or whatever; it doesn't really matter. I'm really interested in how accurate the model is at predicting whether the image is a dog or a muffin. And just to reiterate (first of all, trigger warning: the next slide does have equations, so if you're scared of betas and X's and Y's and Greek letters of all sorts, avert your eyes on the next slide): in the data modeling culture, we're most interested in understanding what the model tells us about the data-generating process. If you collect data about language usage, you put into the model a whole bunch of information that you think helps you understand what's going on. You're interested in the data-generating process: all the data goes in the model, and you're not really concerned with prediction. If you go to publish an article, no one is ever going to come to you and say, hey, look, I found this other speaker that you didn't talk to, who has these characteristics; please put those characteristics in the model, tell me how predictive they are for this person, and I'll tell you if you got it right or wrong. That will never happen, because that's really not the point. You're really interested in what's going on in the data-generating process.
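The dog-or-muffin loop described above can be sketched in a few lines. The "images" here are made-up feature numbers (imagine a single ear-shaped-ness score) rather than real pictures, and the one-threshold "model" is deliberately the simplest classifier imaginable:

```python
# Sketch of the train / predict / check loop from the dog-or-muffin
# example. Each "image" is one made-up feature value plus a label.

training_data = [
    (0.9, "dog"), (0.8, "dog"),       # images 1 and 2
    (0.2, "muffin"), (0.1, "muffin"), # images 3 and 4
]

# "Training": put the decision threshold halfway between class means.
dog_mean = sum(x for x, label in training_data if label == "dog") / 2
muffin_mean = sum(x for x, label in training_data if label == "muffin") / 2
threshold = (dog_mean + muffin_mean) / 2  # about 0.5 here

def predict(feature):
    """Return (label, rough confidence) for an unseen image."""
    if feature >= threshold:
        return "dog", feature
    return "muffin", 1 - feature

# Unseen images paired with their expected results (true labels).
test_data = [(0.64, "dog"), (0.45, "muffin")]
correct = sum(predict(x)[0] == truth for x, truth in test_data)
print(f"{correct} of {len(test_data)} correct")
```

Real classifiers learn far richer decision rules from far richer features, but the workflow is the same: fit on labeled data, predict on unseen data, compare against the expected results.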
In the algorithmic modeling culture, by contrast, we're primarily concerned with prediction, so your priorities change. In this sort of world, some data can help you make better predictions, and some data can in fact make your predictions worse. You can end up in instances where you have very few observations but millions and millions of variables; genetic arrays in genetic testing are a good example of this. Sometimes you only have a few samples, but (I don't really know that much about genetics, and it's late in the day, so I'm not very eloquent here, but you get my point) you have huge numbers of gene variables. So sometimes you have more variables than you have observations, and that might be a problem for your model. So what might you want to do? If you were one of the people who said earlier that they'd taken an introduction to statistics class, you'll be familiar with this data modeling equation: y = β0 + β1x1 + β2x2 + … + βpxp. If you've never seen this and it looks confusing and crazy, don't worry about it. Imagine that you want to predict height given weight. You have an x axis and a y axis; the x axis might be weight and the y axis height, or vice versa, it doesn't really matter. And you have a whole bunch of points there. Now I tell you, hey, look, I want you to predict something for a new person you haven't seen: given this person's value on one axis, tell me what their value on the other axis will be. That's what you have here: the y is the thing you want to predict, and the x's are the things you already know.
And the betas are the weights on those things, and what you want to do, obviously, is draw a line that minimizes the distance between all those points and the line. It's called the line of best fit, and that's what minimizing RSS means: you want to minimize the residual sum of squares. Now let's say you have so much data that you actually have a problem, because some of those x's are not very useful, so you want to be able to adjust those betas. It's very simple: this next equation is the exact same thing, and this is machine learning; it's called ridge regression. The equation is exactly the same except that you add a little penalty term at the end, λ Σ βj², which says: look, don't let my squared betas get too big. And this lambda parameter (the Greek letter that looks like an upside-down y) lets you turn a volume knob on all those betas. So it's a very simple change, and it has taken you from the statistical modeling, data modeling world to machine learning. Okay, now, obviously that's not really the state of the art; the state of the art is much more complicated algorithmic modeling. Here you have a basic neural network, and the real state of the art, what the people who actually build production models at big companies do, is just a slightly more complicated iteration of that. But what I'm trying to impart here is that if you understand ordinary least squares regression (the top equation), you can easily go to statistical learning (the next one) just by understanding what's going on in the model. And once you understand that, with a little more work you can go to this algorithmic modeling world.
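That one-term change can be seen directly in code. A minimal NumPy sketch, with synthetic data and an arbitrarily chosen penalty strength: the only difference between the OLS fit and the ridge fit is the `lam * I` term.

```python
import numpy as np

# Minimal sketch: ordinary least squares vs. ridge regression.
# Synthetic data where x2 is a near-duplicate of x1, which tends
# to make plain OLS coefficients unstable.
rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
x2 = x1 + rng.normal(scale=0.01, size=50)  # highly correlated column
X = np.column_stack([x1, x2])
y = 3 * x1 + rng.normal(scale=0.1, size=50)

def fit(X, y, lam=0.0):
    """Closed form: beta = (X'X + lam * I)^-1 X'y.
    lam = 0 gives OLS; lam > 0 adds the ridge penalty that keeps
    the squared betas from getting too big."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

beta_ols = fit(X, y)             # can be wild, offsetting values
beta_ridge = fit(X, y, lam=1.0)  # shrunk toward sensible values
print(beta_ols, beta_ridge)
```

Turning the lambda "volume knob" up shrinks all the betas toward zero; turning it to zero recovers plain least squares.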
The intuitions here, again, are very simple. You can have a model that doesn't do a good job of predicting, that just makes more or less random predictions. Or you can have a model that's so attuned to the training data, that so thoroughly memorizes all those chihuahua and muffin pictures we showed it, that if I show it a new picture it has no idea whether it's a chihuahua or a muffin. You want some sweet spot where you're not making completely random predictions, but your predictions aren't tuned so tightly to the training data that the model doesn't generalize. You want to find the minimum error, and this is referred to as the bias-variance trade-off. So, wrapping up our discussion of machine learning here: how effective is modern machine learning, well, deep learning? I put a few links here; you're welcome to copy them down and find the articles. It's not always great, as you've probably experienced in life, but it is quite good. There's a massive need for labeled data, and who labels this data, and how do we ensure that their labels are good? Now we're getting to our world: linguists are actually pretty good at getting data from humans, and so linguists often work on this in industry. Okay, so this may be more than you've taken on so far in your explorations of statistics. How can you start learning about machine learning? One thing is to remember that all approaches, even the most complex DeepMind model you can possibly imagine, have X's and Y's: things you want to predict and things you already know. They have weights that go on the X's to predict the Y's, and they have error: the model is wrong, and you want to quantify how wrong it is. So remember that all these approaches have these ingredients. If you want to start learning about machine learning, develop some statistical intuitions.
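One way to see the bias-variance trade-off concretely is to fit polynomials of increasing flexibility to a few noisy points: a flexible enough curve passes through every training point, but that says nothing about held-out points. A small NumPy sketch with synthetic data and an arbitrary seed:

```python
import numpy as np

# Underfitting vs. overfitting on synthetic data: y = sin(2*pi*x) + noise.
rng = np.random.default_rng(1)
x_train = np.linspace(0, 1, 8)
y_train = np.sin(2 * np.pi * x_train) + rng.normal(scale=0.1, size=8)
x_test = np.linspace(0.05, 0.95, 8)   # held-out points
y_test = np.sin(2 * np.pi * x_test)

def errors(degree):
    """Mean squared error on training and held-out data for a
    polynomial fit of the given degree."""
    coefs = np.polyfit(x_train, y_train, degree)
    train = np.mean((np.polyval(coefs, x_train) - y_train) ** 2)
    test = np.mean((np.polyval(coefs, x_test) - y_test) ** 2)
    return train, test

for degree in (1, 3, 7):
    print(degree, errors(degree))
# Degree 7 interpolates all 8 training points (training error ~ 0),
# but near-zero training error does not mean low held-out error;
# the gap between the two is what the trade-off is about.
```

Training error can only go down as the model gets more flexible; held-out error is what tells you whether you've passed the sweet spot.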
Remember that basic inferential statistics will serve you well going forward. Do your own analyses as much as possible, and make sure you understand ideas like statistical assumptions, tests of normality, and model assessment. Just going to Stack Overflow and pasting in some R code that someone tells you builds a linear mixed-effects model: that will not serve you well. Read An Introduction to Statistical Learning if you want to go beyond inferential statistics. If you're already there and you're looking for something more complex, read Bayesian Data Analysis; and if you want something more complex than that, then you're in the wrong place, because this talk is not for you. So that was the part about machine learning; thank you for sticking with me through statistics. Let's talk now about natural language processing. So now we turn to language: how are these machine learning models used in natural language processing? This section is just a brief peek into the kinds of considerations that go into an NLP pipeline. This is not state-of-the-art NLP. It's like when you start learning Python and they say, okay, define a variable that's a string, now capitalize the string; that's kind of the equivalent. So I'm not proposing that this is state-of-the-art, incredible NLP, and knowing these things is not going to get you a job, but it is a peek into this world. Okay, so what is involved in natural language processing? This is just a sample pipeline, things you might want to do if you were building a natural language pipeline, with two sentences: "Moscow also denounced what it described as the rise of 'nationalist and neo-fascist sentiment' in Ukraine's western areas, where it said Russian speakers were being deprived of … It has repeatedly expressed concern for the safety of Russian citizens in Ukraine."
So the first thing we might be interested in here is defining sentence boundaries. A very naive approach would be to say that any time you see a period, that's the end of a sentence. That's going to be problematic, because if we have an abbreviation, or "U.S.A." or something like that, you can have periods that are not sentence boundaries. So again, you're going to need to train a probabilistic model that tries to predict where sentence boundaries are, and it will sometimes be right and sometimes be wrong. Okay, the next thing you might want to do is divide the sentence up into tokens. Now, what are tokens? Tokens are something like words, but importantly, they may not always align with what you think of as word boundaries. For instance, one interesting thing that I see right away here is "Ukraine's." Most people would say that's a word, right? But for the purposes of tokenization, we might want to say that the apostrophe-s, which indicates possession, is a separate token. So again, this is something you're going to have to model probabilistically. After this comes part-of-speech tagging: I have a speaker note here but I can't see it, but I believe the state of the art for part-of-speech tagging in English is something like 98% accurate. So again, this is an instance where you would build a model and probabilistically predict what the parts of speech are. I'm sorry if you're a syntactician, but I'm going to skip over syntactic parsing in the interest of time and move on to entity detection. So I started at a very low level with tokenization, then part-of-speech tagging, going up further to syntactic parsing, and even further up than that to entity detection: you want to be able to model the fact that some of these expressions are related to each other.
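The period problem and the "Ukraine's" tokenization point can both be illustrated with naive rule-based attempts. The regexes here are deliberately simplistic stand-ins for what a trained model would do:

```python
import re

# Why naive sentence splitting fails: not every period ends a sentence.
text = "Dr. Smith visited the U.S.A. yesterday. He gave a talk."

# Rule 1 (very naive): every period is a sentence boundary.
naive = [s.strip() for s in text.split(".") if s.strip()]
print(len(naive))  # 6 "sentences" instead of the actual 2

# Rule 2 (slightly less naive): split on period + space + capital.
# Still fooled by "Dr. Smith", which is why real systems train a
# probabilistic model for this instead of piling up rules.
better = re.split(r"(?<=\.)\s+(?=[A-Z])", text)
print(better)  # 3 pieces: "Dr." is wrongly split off

# Tokenization: "Ukraine's" often becomes two tokens, Ukraine + 's.
tokens = re.findall(r"'s|\w+|[^\w\s]", "Ukraine's western areas")
print(tokens)  # ['Ukraine', "'s", 'western', 'areas']
```

Each hand-written rule fixes some cases and breaks others, which is the practical argument for learning these decisions from labeled data instead.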
So Moscow and "it" have some sort of relationship, and so on. This is called entity detection, or named entity recognition, and you might want to cluster the entities: Moscow and "it" are clustered together; "what" and "rise"; Ukraine; etc., etc. These are just the kinds of considerations you would want to take into account if you're going to model, to have a computational representation of, what's going on in these two sentences. So again, all of these are predicted using deep learning models in state-of-the-art natural language processing. That thing at the bottom of the slide is a visual representation of a deep learning model. This and subsequent steps are typically implemented by research scientists and computer engineers; it's not often the case that linguists, because we are not really trained in math and computer science and things like that, are writing these models from scratch. And a lot of what used to be done by hand, hyperparameter tuning and things like that, is increasingly solved by the computer, by grid search and the like. Linguists instead contribute domain knowledge to identify rules and identify areas for improvement. Linguists obviously have a lot of experience with what language should look like, and so when the model gets something wrong we're pretty good at saying, hey, I think this is what the problem is. Once you have enough of those, you can identify rules; you can say, hey, look, it looks like the model is not great at dealing with possessives, or whatever the case may be. And linguists also work to provide the annotated data that trains these automated processes. So you might think, oh wow, this is all pretty impressive, so NLP is pretty much what people call a solved problem, right? Well, not necessarily.
You've probably noticed that a lot of machine learning can kind of go wrong and do wonky things. The other day I was working on something and tried to get a digital personal assistant to play a song, and I said the name of the song in Spanish, like, "please play [the song]." And the thing said, "here's what that is in Spanish," and just translated whatever I'd said. So why didn't it get it right? There are some things that are pretty challenging for natural language processing, so let's consider this passage really quickly: "The police arrested Mayor John Smith yesterday. He was suspected of stealing garden gnomes, the latest in a crime wave rocking the sleepy town of Springfield." If we think about abstract representations of these sentences, we can start to see that there are things that are a little bit difficult. We have coreference: two expressions, Mayor John Smith and "he," pick out the same entity, and the police are tied to the town of Springfield. If we look at this sentence ontologically: how do we want to represent something like "Mayor John Smith"? Is this an instance of a mayor whose name is John Smith, or should it be John Smith, who has the occupation of mayor? What is the preferable representation here in terms of salience? We have event relations: as humans and language users, we know that the suspecting comes before the arresting; normally someone is suspected of something, and then they're arrested. But a computer doesn't know that; you have to engineer it. These are inferences that require real-world knowledge of criminal processes. Also, there is something like subjectivity: someone being suspected of something is a subjective call.
It's a subjective consideration, and we understand that, but computers don't really. Ditto for things like a crime wave that "rocks" a town: how do we represent the fact that this is metaphorical usage, and that the town of Springfield is not actually sleepy? It's not that it wants to take a nap; it's a way of saying that the town is calm, and ditto for the crime wave. So yeah, these are challenges for modern natural language processing. Okay, starting to wrap up here. That's been a lot of information, and I'm sorry I've gone through it relatively quickly. I don't know if we do questions; is there a question period? Does anyone know? I'm not done with the slides yet, but I guess we'll see at the end; I think there is, because you can put questions in the chat. In any case, I'm happy to go back through parts that might have been unclear, but before that let's talk briefly about what you can start doing now. Let's say you're in a master's program and you think, oh, this NLP stuff seems kind of cool; I've taken an introduction to Python, and I took statistics 101 or whatever; what should I do next? Teach yourself. The first thing is that you're going to do a lot of independent learning; this is not the sort of thing where you turn up and someone just gives you a job. You have to invest a lot of your own time into learning things. So learn how you learn best: that's the first step. I was in the office hours the other day, and several people said, hey, you know, I'm taking an introduction to Python, and I'm sort of interested, but it's boring, and it's hard to stick to, and I just don't really like it that much.
And my advice was: well, then find a project. Figure out some sort of project you're actually interested in. Say I studied Georgian, and it has this very interesting syntactic property, and I want to build something that can predict that; or I want to suggest a better way for my favorite software package to model something; whatever it is, find a project and contribute to it. A lot of natural language processing software is open source, so you can find the people who develop it. Taking the code is called forking the code: you just make a copy of it, look at it, change the part you want, and send the change back to the person who develops the code. That's called a pull request: you say, look, I really admire your work and think it's very cool, but I found something that I think can be improved; please take a look and see what you think. How do you measure progress? You're going to have to present what you've done in the past and what you want to do in the future, and give some sort of indication of the progress you've made on your various projects. And that's going to be extremely important when you go into industry. This works really well for technical skills: you can say, I took an intermediate Python course, and I have this change, this change, and this change that I made to the Natural Language Toolkit package that are now in production, or whatever. You have to do side projects, so find problems, like I mentioned, and fix them. It's not always the most glamorous and interesting work, but guess what, it's what you're going to be doing if you get a job in these fields. So yeah, find a problem and fix it. There's paid work: big tech companies, for instance, hire paid interns every summer. And there's volunteer work.
I mean, what I said about working on something like NLTK, that's free: you sacrifice your time, and what you get out of it is experience. Maybe the most important of all of these things is to keep records of what you did. It's nice to tell people, I know about statistics and I know about programming and I know about this and that, but it's much more convincing if you can point to things you actually did. And the way to do that is to make a personal web page, make what's called a repository, a code repository, and point from your web page to the repository. The repository has project one, project two, project three, and so on, and the web page does a beautiful job of explaining: here's what project one does, here's what project two does, here's what project three does. And for each one of those projects, you can point to all the code you wrote to do it. You really want to be able to prove that you have whatever kinds of experience you have, so it's really important to keep records. Really, really important to keep records. Jump at chances to use new tools. Be sophisticated about data: this goes back to the slide where I said, look, make sure you understand the models you're building. A lot of people build statistical models and have no idea what they mean; don't be one of those people. Apply for an internship: like I said, a lot of companies in tech hire interns to do text-to-speech synthesis and automatic speech recognition and natural language processing and all sorts of stuff like that. And automate your drudge work. Is it the case that you're working on an R script that is now 3,000 lines long and has all of your dissertation research in it?
If so, stop doing that. Figure out ways to write functions, call those functions, and write unit tests to check that the functions work as intended. Automate your drudge work. And finally, prove it: make artifacts and understand impact. If I can impart just one thing to you through this session, it's this: if you're interested in this sort of line of work, make sure that you can show people your work. That's very, very important. It's nice to say, I learned Python; it's a lot better to say, look at all these contributions I made to NLTK. It's very easy to sit through an introduction to Python course. It's much more difficult to go and contribute to a software package. But one of those shows people that you know what you're talking about, and the other is kind of like, well, maybe they know something about Python, maybe they don't, who knows. So I think that's where I'll stop. Thank you for your time; we have about 10 minutes left for questions.

Hi, Chris. Thanks again. I was just wondering, for those of us who have taken the compositional semantics course, what kind of keywords should we use? How can we translate that into relevant language, so that people understand what we actually worked on?

A computational semantics course? Oh, compositional semantics, okay. I think it's very related. One thing that I think compositional semantics connects to is ontologies; presumably you've worked on ontologies and embeddings and things like that. That's kind of the currency of the realm in natural language understanding, which is the province of the digital personal assistants and things like that. So say I tell my Google Home or Alexa or whatever: I love my mom. Why don't we call her now? 
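To make that concrete, here is a toy Python sketch of the kind of lookup an assistant might do to figure out who to call. Everything in it (the CONTACTS table, the resolve_reference function, the phone numbers) is invented for illustration; real assistants rely on far richer knowledge graphs and proper coreference models:

```python
# Toy contact "ontology": maps a relation word to an entry about a person.
# CONTACTS and resolve_reference are invented for illustration only.
CONTACTS = {
    "mom": {"name": "Jane Doe", "relation": "mother", "phone": "555-0100"},
    "dad": {"name": "John Doe", "relation": "father", "phone": "555-0101"},
}

def resolve_reference(utterance):
    """Return the contact entry for the most recently mentioned relation word.

    A real assistant would run coreference resolution to link "her" back
    to "mom"; here we just scan the utterance backwards for a known word.
    """
    words = [w.strip(".,!?'").lower() for w in utterance.split()]
    for word in reversed(words):
        if word in CONTACTS:
            return CONTACTS[word]
    return None

# "her" resolves to the contact entry for "mom"
entry = resolve_reference("I love my mom. Why don't we call her now?")
print(entry["name"], entry["phone"])  # prints: Jane Doe 555-0100
```

The point is just that something has to encode the fact that "mom" names a person you can call; building and maintaining those mappings at scale is exactly what ontologists do.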
You need some sort of representation, an ontology, that says mom can be referred to as her, however you want to envision it. So natural language understanding is a good fit, and it has the added benefit of being a very hot field right now. They're hiring lots and lots of linguists to build lots and lots of ontologies; in other words, you can look for work as an ontologist. I wish I knew more about compositional semantics myself.

Yeah, I think it's very helpful for translating into computational work, because you've learned lambda functions and how everything is related.

I think it's great. Any other questions in the chat?

First, would you suggest putting academic output together with side projects on our website?

Yeah, for sure, definitely. There's nothing wrong with talking about what you've done in academia. My first job outside of academia was in text-to-speech research and development, and the director of the team I was joining said, send me all your academic articles, I'd love to read them. Then I turned up for the interview and he said: oh, thank you for coming; by the way, I didn't read any of your articles. So don't expect that people are going to read your 750-page articles because you're interviewing with them. They're not. But there's nothing wrong with putting the fact that you wrote an article that's in Language or the Journal of Linguistics or whatever on your website. It certainly is impressive, and it's a point in your favor that you can make deliverables, but it's not realistic to expect everyone to read all of your articles. It's just not how industry works.

Next question: is it necessary to learn Python for NLP work? I've done some Java at beginner level; can I use that? 
It's certainly not necessary to learn Python. Python is nice because it's very simple: it's a high-level scripting language, so it has the benefit of being relatively easy to read compared to Java, which, I'll be honest, I look at Java and think, public class this, public class that, and I have no idea what all that stuff means; it's hard to follow. So if you've already done Java, Python should be a piece of cake. But it's certainly not necessary to learn Python for NLP work.

Next question: in your experience, are there opportunities to work on these kinds of projects with a team strictly as a linguist, without coding knowledge?

Yeah, I get this question all the time: do I need to know how to code? And the truth is, you don't need to know how to code. But if you're going to work in tech, you're going to work on a team with engineers, and the engineers are the ones who do the stuff: they write the code that makes the things work that people use. So if you're not willing to engage with the ideas and the code, and you just throw your hands up, you're going to make a rod for your own back. Strictly speaking, you don't have to know how to code, but it will always help you, and it will never, ever hurt you.

The next question asks: where would I go to look for side projects in linguistic annotation, maybe a few hours per week? Yeah, this is a great question. 
I don't really know, since I've not done it myself, but the companies that I've worked at contracted with vendors who hired the people who did the annotation. When I was at Nuance I think we worked with a vendor called Butler Hill, and I know Lionbridge does this too. There are a variety of companies that do this sort of thing.

The next person would like to know the name of the statistical learning book I mentioned earlier in the presentation. Yeah, the authors are Hastie and Tibshirani, and the book is called An Introduction to Statistical Learning. That's the nice, reader-friendly one; there's also one called The Elements of Statistical Learning that is much more in depth. I think someone added it to the chat. So yeah, An Introduction to Statistical Learning.

Next question: will linguists become more or less relevant in the future of NLP? This is a great question. I don't have a crystal ball and obviously don't really know, but I think that as NLP ventures into more and more uncharted territory, linguists become more and more relevant, because linguists understand how humans use language; that's kind of our superpower. The more we get into difficult questions of language and want to model them, the more a linguistic skill set will be valuable. So my prediction is more, but it's a probabilistic thing; I don't really know.

The last question in the chat at the moment: I'm a linguist and I'm learning data science; how can I make sure that my new knowledge of data science makes me eligible for jobs in computational linguistics at companies like Google? 
Well, to some extent this distinction I've made here between natural language processing, data science, and human language technology is artificial. You have people whose job title is data scientist who work on language things, and people who do human language technology who are engineers and don't really know anything about linguistics. So what I'm trying to say is, if you are learning about building statistical models and being sophisticated about using data, then there's no way that work won't make you more competitive, and more eligible, for jobs in fields like computational linguistics at companies like Google.

Next question: any exciting projects you're working on right now? And any suggestions for YouTube or Twitch channels for watching coding live streams? I can't really talk much about what I'm working on now; it's highly proprietary. But I can tell you that I work in the Ads org, which is probably the most profitable machine on the face of the earth right now, so that's kind of interesting. As for YouTube channels, I don't know; I'm old, I read books. And Twitch, I almost don't even know what that is, so I can't really be current on it.

And we have a raised hand, live. I enjoyed your talk very much, so thank you. I'm wondering, because you mentioned that you first did jobs related to voice and then switched to Google and other places that also deal with text: I'm a mathematician, and what do the job opportunities look like for speech scientists versus the natural language processing side?

Well, speech science is great because it gives you a lot of experience with the more technical aspects of language. 
If you want to work in a field like text-to-speech synthesis, you'll be working on that all the time, and the same goes for ASR, automatic speech recognition; it's just a slightly different kind of work. In terms of which side has more robust job prospects, that's really difficult to say.

So we have about two more questions to get through, and we're at time. Next question: any introductory material you can recommend for anyone interested in learning about these topics? Yeah, the book I mentioned, An Introduction to Statistical Learning, is a great first stop. The software package I mentioned, the Natural Language Toolkit (NLTK), is all open source, and it has a book that's online for free; it only demands your time and attention. In addition, there are a number of good textbooks, including one by Jacob Eisenstein, who used to be at Georgia Tech and is now at Google, and who just came out with a new introduction to natural language processing. I have to confess I don't know the exact title, but I'm sure if you look up Jacob Eisenstein and natural language processing, the book will pop right up. There's a lot of nice introductory material, but keep in mind that it's not always written for people who are not technical. Sometimes it is, but don't be scared if you see numbers and equations and things like that; just do your best to wade through it and understand what you can.

And our last question, at least our last question right now: how relevant is a graduate degree for NLP or computational linguistics work in tech? Yeah, I get this question a lot too. The market reality is that if a company can pay someone with a graduate degree and someone with only an undergrad degree the same salary, which one will they hire? 
Obviously they're going to hire the person with the graduate degree, right? If you can have a crappy car or a really nice car for the same price, you'll take the nice car, almost invariably. So don't forget that these fields obey market forces, and the applications within natural language processing and human language technology that are hiring people will be the ones that are the most profitable. It's not the case that you need a graduate degree to work in NLP or computational linguistics, but it is the case that you need a skill set that will let you do the work. And it is the case that, unfortunately, there's a glut of linguistics knowledge in the world and not that many jobs, so keep that in mind; it's important to be realistic about the job market.

Great, I think those were all the questions, so thank you for your time. Have a great day.