Welcome to our session today. It's a how-to on product development of a browser extension. I'm Emily Pace. I'm the principal linguist at Expert System USA, and I'm one of the co-organizers of Linguistics Career Launch. I will go ahead and turn it over to Andrew Nelson from Data People, who is here today to tell us about this project of his, something he worked on and built while working as an industry linguist.
Thanks, Emily. Hi, everyone. Yeah, I'm Andrew. I'm a linguist at a company called Data People, and we look at job descriptions and try to make them work better. And as a sort of side project over the last eight months or so, I've been working on a product called Redefine. So I'm going to share my screen and talk a little bit about my work on Redefine. Redefine is a browser extension, and it's also a mobile application, that automatically simplifies complex language when you browse the internet. It helps you read faster, understand more, and stay engaged in what you're reading. It works by reading the page data of any website that you go to and replacing complex words and phrases with simpler alternatives. When Redefine simplifies language, it shows you an underline on the new word or phrase, and a tooltip when you hover over the new phrasing to show you the original word or phrase. So in this example here, the word "ameliorate" has been simplified to "help." And so as you're reading along, it just says "that can be done to help such a situation" instead of "that can be done to ameliorate such a situation." And that's much easier to read. And we retain the original information here so that you can see what the original word was. So, my motivation behind building Redefine: in October 2020, I was scrolling on LinkedIn and I saw a trend of people using jargon and complex language. Things like "synergies" or "leverage" or "utilize": all these words are more complex than they need to be.
So as I read posts, I noticed that I internally attempt to reword or rephrase these LinkedIn posts. And in many cases, in lots of different contexts, writers don't use a writing aid like Grammarly or ProWritingAid. On emails, maybe they use a writing aid, maybe they don't. There are all sorts of circumstances where writers don't use some sort of writing aid. And this leaves the reader with complex language. That takes longer to read, can cause misunderstandings, and can make the reader lose interest in what they're reading. So yeah, that's some of the motivation. It was made for all kinds of readers, but it's particularly useful for certain kinds of readers: business professionals, students, non-native English speakers, and people with reading disabilities. We found that these four groups really like Redefine for different reasons, and they find it useful to varying degrees. So that was sort of my motivation, what was going on in my head as I was thinking about LinkedIn and all these places where you encounter jargon or complex language. So I decided to see if I could build a product that solved this problem across the internet. I built the minimum viable product in late October of 2020. It had a very simple hard-coded dictionary. That means that the dictionary was in the code, and it was just 20 words. It only replaced words, and it broke a lot of websites. As I continued working on it, I actually got some interest in it on some Facebook and Reddit posts that I made, and people thought that it was an interesting idea. So I decided to continue working on it, to continue building it and sort of scaling it to be a little more useful beyond just 20 words that are getting replaced.
But I quickly realized that if I wanted to update the Redefine dictionary, going from 20 words to 200 words or 1,000 words, I needed to have that dictionary stored outside of the extension. Otherwise users would have to continually update their extension, and that's a really bad user experience. On your iPhone, you don't have to continually update your app to get new Facebook posts or things like that. It just comes in from the API, and that's called a client-server system. So I had to learn about that. I had to figure out: what is a client-server system? How do I make this into a client? What's an API? Things like that. And I found Firebase. Firebase is really easy to set up for mobile apps and extensions, and it allows you to store a database and build an API, which are both things that I needed to do to build this sort of client-server system. This is what Firestore looks like. Well, this is what Firebase looks like, and this is the actual database, the Firestore database. And you can see some of the words that we simplify here. It scrolls a lot more than that. I think we've got three or four thousand words and phrases that we simplify now. But this is sort of the front-end experience of Firestore. And then I can use Firebase to actually do stuff with this. So that was one major problem I had to figure out, and I've had to solve a number of other problems. Some of them were minor. Some of them were really major. For example, Redefine originally sent each node of a website to the server to get redefined. And this meant some websites would send hundreds of API requests per page. And that's a very bad user experience.
So as an example: basically what happens with an API request is the client throws a football to the server. The server edits the football and sends it back. In this circumstance, the client was throwing hundreds of footballs. And so I had to figure out how to fix that. And I did that by only sending one bigger football. And so now the client makes all of the simplifications at the same time. So that covers a little bit of the problems that I encountered as I was building Redefine. But what does Redefine actually simplify? I showed earlier a little bit of the Firestore. And you might have noticed that it was doing one-word replacements, right? When I first thought of Redefine, I pictured it as a product that simplifies all kinds of language: words, phrases, sentences, maybe even paragraphs. Complex language is not just a lexical problem, right? It's not just a word-level problem. It's also syntactic, so phrase or sentence level, and it's even a discourse problem. That's at the paragraph level; there's still complex language at the paragraph level. And unfortunately, I've had some technical problems with regex and things like that with actually doing replacements, given my current system. So currently most of our replacements are single-word replacements, so lexical replacements, but we do some two-to-three-word replacements. The larger that multi-word phrase dictionary gets, the slower Redefine gets. So I'm actually rewriting the API right now to use Python, a specific library in Python, which is way faster than the current system. After I'm finished with that, I'll continue working on the multi-word phrase dictionary. So let's get into some of the features. As I built Redefine, I began to realize the potential and danger of an app that can reword the internet.
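The "one bigger football" fix described above can be sketched roughly as follows. This is a minimal illustration, not Redefine's actual code: the dictionary entries, the function names (`simplify`, `handle_request`, `redefine_page`), and the JSON payload shape are all assumptions, and the "server" here is just a local function standing in for a real API endpoint.

```python
import json
import re

# Hypothetical mini-dictionary; per the talk, the real one lives in Firestore.
DICTIONARY = {"ameliorate": "help", "utilize": "use"}


def simplify(text: str) -> str:
    """Replace each dictionary word in `text` with its simpler alternative."""
    def swap(match: re.Match) -> str:
        word = match.group(0)
        return DICTIONARY.get(word.lower(), word)

    return re.sub(r"[A-Za-z]+", swap, text)


def handle_request(body: str) -> str:
    """Stand-in for the server: takes a JSON list of strings, returns a JSON list."""
    texts = json.loads(body)
    return json.dumps([simplify(t) for t in texts])


def redefine_page(text_nodes: list[str]) -> list[str]:
    """Client side: bundle every text node into ONE request instead of one each."""
    response = handle_request(json.dumps(text_nodes))
    return json.loads(response)


nodes = ["What can be done to ameliorate such a situation?", "We utilize synergies."]
print(redefine_page(nodes))
```

The results come back in the same order as the nodes that were sent, so the client can write each simplified string back into its node after a single round trip.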
So I think one other use case that I recently thought about was a parental censoring feature or something like that, which could be useful, I think. I asked for some product feedback in a few places like Reddit and Facebook groups, and people began to give some really good feature requests. I built some of them, and some of them are more long-term product roadmap. But yeah, I'll get into what's currently available. As a preface, there are three versions of Redefine. We've got the basic version, which works right out of the gate. As soon as you install it, it starts working. You've got the free subscription, which works after you log in. It creates an account for you, and then you can do a bunch of stuff with the free subscription. And there's a premium subscription, which everyone here at LCL can get free access to for life. You just send me an email and I'll get you set up on it. And here are the features. We've got the basic version, where Redefine automatically replaces thousands of words and phrases. You've got the free version, where you get the ability to have your own personal Redefine dictionary. This allows you to redefine your own words that may not be in the public Redefine dictionary. You have the ability to enable or disable websites, which you can actually do with extensions anyway, but this makes it more local and easier to use within the UI of the extension. You get learning and reading mode. I'll demo this in a little bit, but by default, Redefine does the reading mode. So it automatically tries to help you read better. But learning mode is very useful for some people. It does not reword the website, but it provides tooltips and underlines on the actual complicated words. So instead of actually rewording, it keeps the words and just flags the ones that are complicated, and you can hover over them to see simpler alternatives.
And I'll show what I mean here in a minute. And you get access to support. So you can get help, you can submit support tickets, things like that. And on the paid version, you get Redefine dialects. So it will replace UK words with US words. Right now it's just UK to US and US to UK, but in the future, we want to expand this to more granular dialects, so Southern versus Northern dialects or other sorts of dialects. You get Redefine on PDFs. So Redefine will redefine PDFs for you. And there are add-ons. So you get dictionaries for business, medical, science and law. So without further ado, I'll actually give a demo of Redefine. If we look at this article here on nature.com, the title says, "Can physical activity ameliorate immunosenescence and thereby reduce age-related multimorbidity?" And if you're like me, I mean, that's very complicated. And I was like, this is unnecessarily wordy. So I wondered, what would Redefine look like on this page? We'll go ahead and refresh this page and see Redefine in action. So now Redefine has changed it to read, "Can physical activity help deterioration and thereby lower age-related multimorbidity?" Instead of saying "ameliorate," it now says "help." Instead of saying "immunosenescence," it now says "deterioration." And we can see the learning mode by clicking here. And now we see the actual text unaltered, except now it shows little tooltips. And so we can see the simpler alternatives by hovering over these words. So this is what the extension looks like, and you can manage your words here. I can create a new word here if I want to test, and it will give little alternatives here. You can switch your preferred dialects. You can manage your add-ons here. You can switch between reading mode and learning mode, and get support. So that's what Redefine looks like and that's how it works.
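The difference between reading mode and learning mode shown in the demo can be sketched like this. It is only an illustration under stated assumptions: the `SIMPLER` entries mirror the words from the demo, but the `render` function, the `mode` parameter, and the choice of a `<u title="...">` element for the underline and tooltip are hypothetical, not how the extension is actually built. The input is assumed to be plain text.

```python
import html
import re

# Hypothetical entries taken from the demo; the real dictionary is far larger.
SIMPLER = {
    "ameliorate": "help",
    "immunosenescence": "deterioration",
    "reduce": "lower",
}


def render(text: str, mode: str = "reading") -> str:
    """Underline dictionary words and attach a tooltip.

    reading mode:  show the simpler word, tooltip holds the original.
    learning mode: keep the original word, tooltip holds the simpler one.
    """
    def mark(match: re.Match) -> str:
        word = match.group(0)
        simple = SIMPLER.get(word.lower())
        if simple is None:
            return word  # not a flagged word, leave untouched
        if mode == "reading":
            shown, tip = simple, word
        else:
            shown, tip = word, simple
        return f'<u title="{html.escape(tip)}">{html.escape(shown)}</u>'

    return re.sub(r"[A-Za-z]+", mark, text)


title = "Can physical activity ameliorate immunosenescence?"
print(render(title, mode="reading"))
print(render(title, mode="learning"))
```

Both modes use the same lookup; only which word is displayed and which goes into the tooltip is swapped, so switching modes does not need a second dictionary.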
I'm still working on it; I think I work on it maybe 10 hours a week still. But the future of Redefine is, as with most language tech problems, a machine learning problem. There's actually a fair amount of research both in academia and in industry on lexical simplification and syntactic simplification. And I'm currently working on figuring out what's the best solution here. Ideally it will allow the reader to adjust how aggressively Redefine simplifies in the automated way. So yeah, things that I've learned. I've learned about client-server architecture, like what an API is and how to make one, how to make a client-server interaction work. I've learned about front-end web development: JavaScript, HTML, CSS, WordPress, PHP. I've learned about backend and database development, including Firebase Cloud Functions, Firestore and Google authentication. I've learned about mobile app development, including Swift and Kotlin. And I've learned about the simplification NLP task. That's a very big task in the world of NLP, and some of the relevant things that go into that are BERT, Transformers, Docker, Python, Kubernetes, GCP, which is Google Cloud Platform, and language model deployment. So just a quick note before we get to the questions: as a linguist, I really appreciate the nuance that certain words have over others, right? I appreciate a well-crafted article or a well-crafted note. But sometimes those well-crafted nuances can impact the audience; they can impact the reader's understanding. And that's obviously not the intent of the author, but the goal behind Redefine is to sort of relieve or help those sorts of situations. And so, yeah, that's Redefine. I'd love to open it up to any questions or...
Thank you so much, Andrew. We actually already have a few questions in the chat, which I'm happy to read. So can you talk a little bit more for our audience who might not be familiar with APIs? What's an API?
And when you were talking about that football, what's the football?
Yeah, yeah, so an API is basically how computers talk to each other. It's like when I text you. Say I wanna text you: I have to write out a text and then send that to you. That's basically what's happening. And actually, say I need to ask you a question and I need to get data from you, right? I don't have that data on hand, but you do have that data. So in the client-server thing that I was talking about earlier, the server is what does the processing. That's what knows the answer, right? The client is the one that's asking the questions. So the client in this case is Redefine. It has a bunch of data about the page that you're on, and the text that it sends to the server is like, here's the data and here's your authentication information. And then the server receives that data and begins to process it. It begins to make sense of the data in terms of authentication, and in terms of the actual data that it received, the text data, the page data that it gets. So an API is how they connect to each other and talk to each other. And you'll typically hear about an endpoint. An endpoint is a URL that you send data to; that's how you connect to the server, and then that endpoint will process that text and send it back to the requester. Yeah, I hope that made sense.
We have another question here about what the process of building the dictionary was like. So what form was it in, and did you build that manually?
Yeah, the vast majority was built manually. Like I said, there are a few different dictionaries. We've got the one-to-one dictionary, and then we've also got a multi-word dictionary. The one-to-one dictionary was pretty simple. I found a bunch of different resources about simpler alternative dictionaries and things like that.
And one of the challenges is foreseeing instances where a word may not work in another context. So a word like "ameliorate" gets changed to "help." But is every single instance of "ameliorate" easily replaceable with "help"? Is that always the case? Because it's a really bad user experience when a user sees a replacement that doesn't work in context. So I've done a lot of testing on those different words, just a lot of thinking about different circumstances where a replacement might not work. And that happens a lot with phrasal verbs and things like that. So building out the dictionary was not super challenging, but I used a lot of different resources to build it.
And on the note of resources, we have a couple of similar questions. One about whether you wrote the program in Python, and then what your background in tech was like. So what coding skills did you come in with to build this, and what did you have to learn along the way?
Yeah, so the first question about Python. No, it's actually all in JavaScript. Both the front end and the backend are in JavaScript. The Firebase Cloud Functions that I was referring to, that's what I refer to as the server. It's technically called serverless, but it's the backend. So that's what the extension contacts, and the Firebase Functions are what process the text. That is all in JavaScript; it's a Node.js server. Actually, it's not a great system currently. As I said, there are a lot of problems with it. I'm experiencing this problem called catastrophic backtracking with regex, where you've got a very large regex and a very large string, and that can cause huge time delays. So right now I'm working on moving it into Python, and I'm going to use a Dockerized Python app on Google Cloud Platform. I actually have tested it and it's about 15 times faster. So it's great.
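One general way to sidestep a very large regex for phrase replacement is to split the text into tokens and look up one-to-three-word phrases in a hash table, trying the longest phrase first. The sketch below illustrates that idea only. It is not the Python library mentioned in the talk (which is not named), and the `PHRASES` entries and the `simplify` function are hypothetical.

```python
import re

# Hypothetical entries; keys are tuples of lowercase words.
PHRASES = {
    ("ameliorate",): "help",
    ("utilize",): "use",
    ("in", "order", "to"): "to",
}
MAX_WORDS = max(len(key) for key in PHRASES)

TOKEN = re.compile(r"\w+|\W+")  # words and separators, strictly alternating
WORD = re.compile(r"\w+")


def simplify(text: str) -> str:
    """Replace dictionary phrases, preferring the longest match at each word."""
    tokens = TOKEN.findall(text)
    out = []
    i = 0
    while i < len(tokens):
        replaced = False
        if WORD.fullmatch(tokens[i]):
            for n in range(MAX_WORDS, 0, -1):
                span = tokens[i:i + 2 * n - 1]  # n words plus n-1 separators
                if len(span) < 2 * n - 1:
                    continue  # not enough tokens left for an n-word phrase
                words, gaps = span[0::2], span[1::2]
                if not all(gap.isspace() for gap in gaps):
                    continue  # punctuation between words breaks the phrase
                key = tuple(word.lower() for word in words)
                if key in PHRASES:
                    out.append(PHRASES[key])
                    i += 2 * n - 1
                    replaced = True
                    break
        if not replaced:
            out.append(tokens[i])
            i += 1
    return "".join(out)


print(simplify("We utilize tools in order to ameliorate delays."))
```

Because each candidate phrase is a single dictionary lookup, the work per word is bounded by `MAX_WORDS` and does not grow as more phrases are added, and there is no pattern that can backtrack.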
And then my background in tech: I work at a company called Data People, and I basically work on the language guidance that we provide, but I've also developed a number of internal tools, like a linguist app and a few Chrome extensions for Data People. So yeah, I've got some experience, but Redefine has really, I guess, redefined how much I know about tech and how things work all together. I didn't know anything about what an API was or Firebase or anything like that before I started working on Redefine.
How did you figure out what tools you needed to solve what problems? We always get a lot of questions about, where do I learn about things?
Yeah, yeah, so I think you have to start with a goal. My goal at the beginning was, okay, I need an MVP, I need to get something working where it can just replace the text in any website that I go to. How can I replace text? So that's the first step: Googling a bunch of things like, how do I inject a content script to replace text? And then you get to the next problem, and it's like, okay, I've got this working thing, and I want to be able to do a thousand words, not 20 words, and I can't stick these thousand words in the extension. Here's the problem: I need to be able to update the dictionary. Well, how do I solve that? That required me to think about, okay, how do other apps do that? How do other extensions and things like that do that? Well, they use an API. They use a client-server architecture. So yeah, I think one step at a time. That's really been sort of my theme working on this product: one step at a time. For example, another thing that I solved during my time at Redefine was building the landing page, which, am I still sharing? Yeah, show. Which, okay, so I'm not a designer by any means.
So this is still a work in progress by a lot, right? It's not pretty, but I built it all, and I figured out how PHP websites work and things like that. So yeah, one step at a time, that's sort of the advice that I would give about where you start. Well, you start with a need, and then how do you solve that need? What's the first step? What's the minimum viable product? In my case, I wanna simplify language for people. How do we do that? Well, an extension is definitely the best way in terms of on the internet. I'm not really sure about other situations like reading on a Kindle, I'm not sure how that would work, but reading on Kindle on Google Chrome, that would work. So yeah.
So all these little steps to get there, are those steps that you have been doing all on your own, or do you have collaborators who you're working with?
Yeah, in terms of tech, it's all been me. Some of the marketing has been done by some other people, and I have a co-founder that's helping me. So some of it has been other people, but most of the tech has been me. And same with the dictionary development. In the future, I do want to sort of open that up. I think one thing that I wanna do is build a sort of rules-based system that maybe other linguists can contribute to or something like that. And I think that would be very powerful for scaling the dictionary.
And we just got another question in the chat about whether you've thought about adding a dark-themed reading mode to increase accessibility options. So, an idea.
Yeah, so I've had a request for that before, as well as text resizing options and things like that. Like how big is the text, that kind of thing. That's difficult to modulate on an internet-wide level.
It's difficult because so many different websites are so different from each other. Yeah, I don't know. It's a challenge.
Does anybody wanna chime in with other questions? Either put them in the chat or turn on your video and audio. Are there questions that we have for Andrew? I would ask a question for you while others are thinking about what questions they might have, which is, you know, this is something you've been working on as a side project while you've been employed full-time. How do you balance that? And for our attendees who are still students, they might be trying to balance doing a project like this with their schoolwork. So how did you handle that?
Yeah, so I think it's really important to strike that balance to not get burnt out. But for me, it was sort of born out of this desire, A, to solve this problem, and B, to expand my resume and to learn all this stuff that I didn't know previously. It's been a huge learning opportunity for me. And each little piece, like I said: right now, I'm working on this Dockerized Python container that will serve as the new API endpoint, as the new server. I'm learning right now, how do I actually deploy that? Or in other cases, how do I serve or deploy a language model? And that's a learning opportunity in my mind. And it will be more useful, I think, than just here in this moment for Redefine.
And we have a new question, which I think you touched on a little bit before: is there or will there be an option for users to share their custom word replacement list, maybe even for interesting or funny things beyond the original purpose, for better or for worse?
Yeah, yeah. So I have to think through more about what the product would look like then, how I would go about having multiple levels of things.
I've been thinking about switching. So right now, you've got sort of two main dictionaries. We have other dictionaries, but we've got this concept of Redefine public dictionaries. Those are the ones that Redefine will apply for everyone. But then you've got your own personal dictionary. And I've been thinking about making those different colors, or I don't know, some sort of system like that. And then I think with the add-ons, we could create new add-ons that could be user-created. So maybe that's the approach there. I have to think more about it, but honestly, I would love to collaborate. If anyone is interested or has thoughts about it, I'd totally be interested in working with anyone to expand that sort of product thinking.
We've got another question here. How would you advise someone to get over that hump of learning something completely new, especially for a linguist who does not have much technical or coding background?
Yeah, so going into my current job at Data People, I didn't know a ton. I guess I did start a minor a long time ago in computer science. But I think the main thing that I would suggest is starting small and doing one small project, and then trying to expand on that project and make it better, make it faster, make it cleaner, that kind of thing. A good example, for me: working on Chrome extensions has been very, very helpful, because I've learned JavaScript, I've learned HTML, I've learned how websites work, things like that. But it started out very, very simple. Way more simple: instead of replacing text in a website, I just wanted to add something to the top of a website. Like, how can I add something to the top of this website?
And I think that's step number one. Or even just creating your own personal website, I think, could be very, very useful. And also productive; I mean, you could put that on job applications. Here's my website, here are some of the things that I've worked on. I think that can be very interesting to future employers.
Any other questions that people want to add in here as we get to the end of our time? This is a great question: at what point would you put a project like this on a resume or LinkedIn?
Yeah, so I put it on my resume. I don't have it on my LinkedIn, but I guess it's on my portfolio website too. I put it on there after I actually created a company out of it, so I'm the founder of Redefine, but it's still very, very small and early, so I'm not making any money out of it or anything like that. But I would do it as soon as you're proud enough of it to show it off to other people. At this point, I'm proud enough of it to show to other people. At the point where it was replacing 20 words and it broke a ton of websites, I wouldn't have shown it. It still does some weird things on some websites, so I'm still working on it. But I think as soon as you're confident enough that somebody could actually find use out of this, then I think it belongs on your portfolio or LinkedIn at that point.
Would you have been comfortable in that early phase mentioning it as part of an interview process, even if it wasn't on your resume yet?
Yeah, I think mentioning it can be very useful just in terms of showing that you're a curious person and that you will push yourself, that you'll teach yourself, that you'll work to figure out a problem that you haven't understood before.
I would definitely echo that. I think that's definitely true. Employers like to see curious people who want to solve problems. All right, well, I think we can go ahead and wrap up here unless anybody else wants to chime in with something. So thank you so much, Andrew, for coming today to tell us about your work and your experience. I think this sort of side project is so encouraging for people to hear about, because they are getting the advice in other venues that they should have a side project, and you've been generous enough today to share with us how you can have a side project. So thank you so much, and everyone have an excellent day.
Thank you.