English or something. It's more like, no, these terms don't make any sense. Okay, not that kind of analysis, not that kind of visual narrative. We are more into creating, you know, a creation tool. We have a new generation tool, right? Nice, interesting, because it's a spectrum. It's narrativeVIZ. How many people have heard of narrativeVIZ? It might be interesting to introduce narrativeVIZ; that's also the name of my company. No, what I was thinking of was Narrative Science. Okay, Narrative Science, yeah. But narrative visualization is telling a story; I mean, telling a narrative is the geeky way of saying telling a story. Yeah, true.

How many have done stuff on Excel? At least everyone? Serious question. No, no, it's okay. I mean, fast.ai has, like, deep learning on Excel, right? It's a visual functional programming tool; that's how you define Excel. Okay, any other questions, anyone? I'm just keeping everyone engaged. Okay, we could actually start. Okay, yeah, sure. The odd seat, that's the tricky seat; I will fall over. No, we can just start it. But the mic is everywhere; I don't want everybody shifting to the center. Okay, so hopefully I will introduce, and Uma will show up magically.

Thanks so much to all of you for being here on a Friday evening; I'm looking forward to an interesting discussion. Part of the motivation for putting this discussion together was literally to create a lab situation to understand what the real aspirations and the real anxieties are with respect to math and stats. Like you rightly pointed out, data science does appear intimidating. At the same time, the two buzzwords of today are probably data science and blockchain. Is that so? Also, anything you're doing an open house on is a buzzword. Okay, that is what actually
I was telling: whenever there is a buzzword, HasGeek does an open house and tries to demystify it. I think part of our own interest in demystifying buzzwords comes from the fact that we are seen as this traditional services and events company, which we're not, but that's how we're seen. People will often say, "Oh, so only that many people are coming to your conferences? After six years you have only 15,000 people who attend your conferences, so where is your scale and where is your reach?" Whereas the question of scale and reach is pretty diverse and can't be quantified in those terms.

But coming back to this open house: the idea was really to demystify, and to understand how much math and stats you really need to know if you were to make a journey into data science for the reasons that are listed out: for interest, for a job change, for a career, where career is probably standing for something much more difficult. We recently concluded The Fifth Elephant and Anthill Inside, two of our conferences, this year being the sixth edition, and the anxiety and the aspiration appear to be a lot more pronounced now. I'm sure we understand why that is the case. I thought the best way of doing this was to literally go over the journeys of four or five people who have done this in different ways, either through a new job or through having an academic background, or in various other ways. So we'll hear about that today. This is just to facilitate the discussion, and then the floor is open to questions.

I'm very happy to introduce Amit, who probably needs no introduction. Amit Kapoor has been around with The Fifth Elephant since 2012 and probably even longer; I have known him since 2012. I think if there is a serious problem in math and data science, it is the lack of good teachers like Amit,
and I think anybody who's learned from Amit will recognize how good a teacher he is. So I'm not going to say much more, and I'll let Amit moderate this session.

Okay, thank you, Zainab. So yes, this is not really off the record, since it's being recorded, but it's an open session. Even though we are sitting on this side, we are by no means trying to hog all the time; the idea is to have an interaction with everyone in the room. I will set it up, both in terms of the rules and the broad questions that I wanted to address through this.

There are three questions that I really come across when I teach. I've been teaching machine learning and data visualization for the last five years, actually seven years. I have a number of open training repositories on GitHub, and the one repository with the maximum number of stars, touching nearly a thousand, is called HackerMath. So there is clearly demand for this. I also feel that any time people come, they have this burning question of how much math they need to know, and I see three possibilities.

The first is when somebody says, "I'm interested in solving a problem and I want to use data science." Their fear at that point is that they've heard there's a lot of math involved: should I really get into this, or should I stay far away? The analogy I use in that case is: I want to drive a car to go from point A to B. Should I learn the art of driving, that is, use the tools, or should I learn the engineering of the car? So it's that kind of math versus actually doing it. (You're free, I mean, you can come in from that side also.) The second question I wanted to address is: I want to make a job change,
so how much math do I need to know? Many of you are programmers, and if you go to a programming job interview, even though you may not need to know how a sorting algorithm works in your actual day-to-day programming, that is the kind of question you get. The same happens in an ML interview: you get asked a lot about the math and the algorithms behind it, and that's again a fear. So: I want a job in data science, how do I address it?

The third one that I want to target is: I want to build a career. In that context it's not so much a question of whether I need to know math, because if you're trying to build a career you will eventually get to know the math. That's what I believe; I'm sure some panelists have a different view. The question is rather what I should focus on, and how I should go deep or go vertical. So those are the three questions: interest, job, career.

What I want to do is ask everyone on the panel here. I speak a lot, so I'm going to try to speak less and get everyone to talk about their journey, for about five minutes each, starting from here: when you got interested in data science, how you approached the math when you got interested or tried to get a job, and, now that you are actually practicing it, how you think about math. I'm really hoping people will have radically different opinions. So, a short introduction from you, and then your story on these three.

Okay. So, I'm working at co-works, which is a startup based out of Bangalore. Around two years back, not even two, like 1.5 to 2 years back, that's when I started exploring this field. I was in security; I wanted to be a hacker, I mean the hacker that we see in movies. But after I got a job as a security analyst, I realized the profession is not as fancy as we see in the movies. So I was writing a security tool
that analyzed your typing pattern and authenticated you. I just used some basic data structures and put it on GitHub, and somebody from Russia came along and told me to use some ML, because then I would probably get good accuracy. So I searched for a community that meets for data science things, found Deep Learning Bangalore, and went there, and that's when I started. People asked me to go through some courses, and I didn't have much work in my old company, so I put all my time into those. I spent around six months, then I started searching for a job, and I got a job in AI. So that's how I started.

Talking about the math: like probably 90% of people, I started with Andrew Ng's course on Coursera. He gives you the basic idea, and I guess that's good enough to start with. Once you get what he's talking about, once you understand the terminology and the way things work, you can probably start a career and get a job, even though you won't understand the exact equations and how everything turns out. I guess that works better. If you are trying to make a career you will have to learn it anyway, so that will come to you as you go.

So you pretty much moved from one field to another, using ML as a tool to do that. Are you still in security, or have you moved?

Yeah, no, my designation is senior developer, AI.

Designation, that part is clear.

I still remember I actually talked to Uma at the first meetup I went to; I don't know if you remember me talking to you. She asked me that day to do the Andrew Ng course, so that was the start.

Nice. Okay.

So I'm Ishaan. I'm working for Walmart Labs in the data science team. Talking about my data science journey: I'm an engineer, I graduated from IIT Roorkee, and I came to know about analytics in my third year. I ran some small businesses during my college days, and I was sure that I wanted
to work closely with the business side. I spoke to my seniors at the time. Analytics was picking up, but not many jobs were available in the market. I got to know about it, researched it, and I liked it, so I started reading a little bit about it, and I finally got a job at Absolutdata Analytics. That was 2011; the company is in Gurgaon and had 200-plus employees. I was in marketing analytics, and my project was mostly market mix modeling. For the first two months I was in lots of trainings, and I got to know analytics from the inside for the first time. I had read about it a lot, but now I was looking at what math is happening behind the model, how you can translate it to business problems, and how you can solve those business problems. That was really nice, and that was when I really got hooked, because I was working closely with the business.

The idea of market mix modeling is to identify the effectiveness of different marketing campaigns, price changes, new products and so on. There is a lot of back and forth between the business and the model, because your model has to speak the business language, and you have to translate it. The business will also have lots of ideas about how their campaigns are performing, so they will come back to you and say, "No, the effectiveness of this campaign should be on the lower side. Why are you getting it on the higher side? We didn't see that much response in sales when that campaign was on air on TV." So you have to understand the math behind that model so that you can translate well. So yes, math is important, definitely.

From there I moved into a consulting role. I worked for Dell for one and a half years. That was more about measuring the effectiveness of
different marketing campaigns, and now I had moved from FMCG to B2B retail. It was a different role: B2B retail is not about above-the-line campaigns; it's more about phone calls and all the BTL activity. There the problem was a typical test-and-control study. I am not sure how many of you know it: in test and control, you try to form two homogeneous groups, and the only difference is the treatment that you have given to the test group. It's very simple. Say I ask how many of you are computer science engineers and how many are not; you will have two different groups. Then, how many of you have more than two years of experience and how many have less; now you will have two times two, that is, four groups. Those are the homogeneous groups among you, and I would like to make comparisons within a particular group. I don't want to compare one group with another that is not similar.

Here comes the concept of similarity, which is very much the background of most supervised learning classification problems: you have some kind of similarity measure running behind them. It can be Euclidean distance or any other type of similarity measure. So at that point I realized that math and stats are important.

There were three other pillars as well. Programming is there. Problem solving: you need to be a good problem solver, and that's very important. And the third is domain expertise: if you don't know the domain, you won't be able to understand whether the data is correct or not, or whether it represents the population. So you need those components as well. But understanding the math and stats, what's going on behind the algorithm you are applying to a particular dataset, is very important, because that will give you the confidence to explain it to your business partners, and that's where the game begins.
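
The test-and-control grouping and the Euclidean similarity idea mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the two-year experience cut-off applied as `years > 2`, and the helper names `stratum`, `lift_by_stratum`, and `nearest_control` are all hypothetical and do not come from the speaker's actual project.

```python
from collections import defaultdict
from math import dist          # Euclidean distance, Python 3.8+
from statistics import mean

# Hypothetical records: (is_cs_engineer, years_experience, in_test_group, outcome)
people = [
    (True, 5.0, True, 12.0),
    (True, 4.0, False, 10.0),
    (True, 1.0, True, 7.0),
    (True, 1.5, False, 6.0),
    (False, 3.0, True, 9.0),
    (False, 6.0, False, 8.0),
    (False, 0.5, True, 4.0),
    (False, 1.0, False, 4.5),
]


def stratum(is_cs, years):
    """Key for one of the 2 x 2 = 4 homogeneous groups."""
    return (is_cs, years > 2)


def lift_by_stratum(records):
    """Mean test outcome minus mean control outcome, computed within each group."""
    buckets = defaultdict(lambda: {"test": [], "control": []})
    for is_cs, years, in_test, outcome in records:
        arm = "test" if in_test else "control"
        buckets[stratum(is_cs, years)][arm].append(outcome)
    # Only compare groups that contain both test and control members.
    return {
        key: mean(arms["test"]) - mean(arms["control"])
        for key, arms in buckets.items()
        if arms["test"] and arms["control"]
    }


def nearest_control(test_features, control_features):
    """Index of the control row closest to a test row by Euclidean distance."""
    return min(
        range(len(control_features)),
        key=lambda i: dist(test_features, control_features[i]),
    )


lifts = lift_by_stratum(people)
for key in sorted(lifts):
    print(key, lifts[key])
```

Comparisons are made only inside each group, never across groups. In a real study the features passed to `nearest_control` would usually be scaled first, so that one feature does not dominate the distance.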
You actually have to help the business so that they can use it and implement the solution. So I started reading lots of material online. Andrew Ng's, again, is a good course that will give you a little bit of breadth: understanding the algorithms, the math behind them, and the intuition. It helps you develop that intuition for what's happening in the background. There are other courses as well. NPTEL has different courses on linear algebra, probability, and distributions, so do explore that; professors of the IITs and IISc have lots of videos there, and if you want a certification you can do that as well. Then you have lots of material available online. You just need to figure out where you are headed and what is required to reach there.

So how do you learn now that you are in your third job? What's your learning step?

I think I have the breadth; now I need a little bit of depth. So whenever I apply an algorithm, I look at what the characteristics of that algorithm are and what the characteristics of the data are, and whether they tie up, whether they are in sync or not. Otherwise there are hundreds or thousands of different algorithms, you have the data, and you can apply all the algorithms, right? But that's not how we should approach it.

So hold that thought, because I think we will come back to the career part of it. You are now deep into it, and I think many of the people here are still exploring whether to even come into this area or not, and if I let you speak more they are going to run away very quickly. So let's come back to that in the third question, the career; hold on to that thought. Why don't you take the first question?

I work for Treebo Hotels; I lead the data science team there. My journey with data science started in 2014, when I was looking for a master's
dissertation project and picked up one of the Kaggle competitions, which was on text classification. That was the start. I've always been interested in ML and data science, so initially I did a lot of data engineering work around data science and then slowly moved on to understanding data science models. I always take the top-down approach: applying some model and then understanding the math behind it. When I start learning something new, not just in data science but even in regular engineering, I always take the top-down approach, the hacker's approach, of trying to understand the intuition first and then the logic and the math behind it. So that was the start of my journey.

Then I joined an analytics company where we were building a data-science-based coach dashboard for one of the fitness bands. I was like a one-man army there, looking after the data science part and the engineering part as well. That gave me a lot of confidence in applying data science to any business problem, and I think that really changed everything. In my next job I worked closely with the customer service team, handling a lot of data science and NLP with the CRM team for one of the e-commerce companies: social media integration and NLP on tickets. After that I had my own data science consulting company, where we worked with a lot of businesses trying to solve business problems with data science.

What I have understood is that a lot of people want to get into data science but are scared of math. My advice would be to just take a problem and understand the different ways you can solve it. As Ishaan mentioned, there are hundreds of ways. Then go a little deeper: does this solution really fit the problem you have? And then go deeper still and understand the math behind that. So that is my style and my journey with data science.

So the hacker way is there on the third side. Vinay, you are?
Hi, so I am Vinay Higre. I am an independent consultant, and I have worked in a lot of different areas in my career. I started with embedded systems, then moved on to networks, large-scale distributed networks, and then at InMobi I moved into the data engineering side. Unlike a lot of people on the panel, I am not a data scientist; being a data scientist is not my day job. But having built a lot of data engineering systems, the question becomes: all of this for what? One option is definitely to do visualization, which I learned on the job. The other is to get insights out of data, and there are two parts to that: one is machine learning, I would say the probabilistic part, and the other is the statistical part. My strength has generally been more on the statistical part.

When I joined InMobi, the challenge was that we had lots of data coming in, and the first challenge was how to ingest this data, process it, combine it, and validate it. Once all of that was done, the next question was: now what? It's nice that you have built a system and the stack is ready. That's when I started exploring. The biggest problem we faced first was that, now that we had all of this data, with some 250 dimensions on which we could explore it, how do we visualize it? That's where I came across ggplot2, and hence R, and hence the Hadleyverse, or the tidyverse as it is known nowadays, which is a bunch of R packages that keeps you sane, because a lot of R can also make you insane with its very weird, I would say inconsistent, syntax.

Since then we have been solving problems. At InMobi there were some problems on forecasting. Later on I started working for another company called Helpshift, where we explored a little bit of sentiment analysis as well as a little bit of text processing and a little bit of topic modeling
there using Vowpal Wabbit. I think I was introduced to Vowpal Wabbit by one of our friends at one of the first hack nights that happened at HasGeek, actually. Somebody asked how much math you did. My reaction there was, "Wow, this is like magic." I don't know how many of you have heard of LDA topic modeling; the first time you encounter it, it does feel like magic. And then, because you are an engineer, you know that there is no magic. You are grown up now, so unfortunately you don't believe in magic anymore. As you grow up, first you believe in Santa Claus, then you say, "No, it's my uncle," and you lose the magic. That's when I started digging deeper. I tried to understand the math, and it was not very accessible, I can say, so I kind of gave up on that.

The thing is that somebody else has thought about it and has built those libraries, so you can still use them without knowing the exact equations and how all of it works. But even if you don't know it a mile deep, you can at least know it a few feet deep, right? That definitely helps in terms of tuning what needs to be done.

Also, as Ishaan was saying, you should have some understanding of the domain. For example, at Helpshift a lot of messages came in and we were trying to figure out the sentiment. About Helpshift: it's an in-app help desk, and we have a platform where a lot of messages come in, and then you classify those messages and route them to the right people. If you do sentiment analysis on such a dataset, it will be overwhelmingly negative, because nobody sends you an email saying, "Oh, the system works perfectly, I got my ticket, yo." Nobody does that. Most people are complaining that some stuff doesn't work, or
doesn't work as I expect, or something is broken. So if you do sentiment analysis on something like that and show it to customers, the customers will be pissed. That's where the domain understanding matters: to interpret the results that you get and decide what to apply, you need to understand the domain, come up with a hypothesis, and then test the hypothesis.

The last thing: I'm now consulting with Zoomcar. There also we get a lot of IoT data, again another buzzword. I hope nobody's playing buzzword bingo; you've covered most of it. So we get a lot of vehicular data. Every car has sensors, and you get data about things like how fast the car is going, how fast you are braking, what gear you are in, whether you are revving the engine, a whole bunch of things. You can use these as inputs and build models of whether the driver is driving well, or is within a certain threshold, and that can have a huge business impact, for example if you are over-speeding or using the clutch too much. There also you need to know the domain really well. For example, car usage in a city is quite different from car usage when you're going on a vacation: you probably have a lot more highway driving, and if you go to the hills you'll use the clutch a lot more. So knowing how the engine works, how the driver behaves, and what the conditions are helps a lot.

I am probably there to provide contrast, because I don't have a large math background. But what I feel is that even knowing the basics, like distributions in statistics and simple classification or logistic regression algorithms, can take you a long way in understanding. The biggest point is, it's not that the math is missing, but the math is not the focus. You can actually go from the inside out. I came in more from the visualization point of view, and then, as an engineer, I wanted to know how the machine works. And you can
always, maybe, not fully understand all of the math, but you should know how to use the tools, and have a little bit of background on how you can tune them, and the domain. So you should know just enough at least, and you can build on it over a period of time. In terms of course recommendations, I'll stop at that.

Stop, pause, no recommendation? Okay, fine. Let's just finish with Uma: your introduction and your journey through the three: how you got interested in this topic and understood the math, how you learned it to get a job, and now, with a career at LinkedIn as a data scientist, how you are learning the math required to do it.

So, hi everyone. I got my master's degree from IIT Bombay in 2008. I think that was my first serious introduction to math, I would say. Sure, we all get good marks in high school and college, but you realize that none of that matters; math at a higher level is different. During my master's degree I got interested in machine learning and data mining, and that's where I realized the need to go seriously into math. Once that finished, I thought, well, I have everything. So I started my job at Yahoo Labs, and as it happened, Yahoo Labs at that time had very good people in data mining and machine learning, and I realized how shallow my knowledge was.

I was working as a research engineer there, building machine learning models, ranking models in particular, for image search. Much of it was already there; there are specific algorithms that you have to follow because of practical constraints, and most of my work was limited to feature engineering: given the data, identifying which features to use for image search. Even at that level, when we were given click data, I understood that although click data had been beaten to death, whenever we wanted to construct new features my knowledge was not sufficient. And I realized that my peers who had been in that field for a longer time, and who had a better grasp of math, were able to do it
better. That motivated me to understand math better, and I actually decided to go and study for a PhD; one of the reasons was that I wanted a better grasp of math. Don't take the wrong message: I am not telling you to do a PhD; that would be too much. But from my perspective, if you want to go out of the box, if you want to do something different, then an understanding of math is needed. Otherwise there are a lot of commoditized data mining and machine learning packages, and you can do well even by just knowing those. But it helps to understand how people build models, what the template is that they follow: how do you convert your data, and your constraints on that data, into an objective function that you have to optimize? Again, some keywords, but knowing that template helps. Once you understand that template, suddenly a lot of things become very easy, very simple to understand. Given new data or a new domain, you can convert it to a similar problem that you have seen before. So for me math was very important; it was important enough to propel me to get another degree.

So we have got one person who definitely knows the stuff she is talking about, and all of us, on the other hand, are just thinking. So, really interesting: one who is trained in math, a couple of people who pointed more towards learning on the job and gradually peeling the onion, as I would say, and two of you who really dug into it because that is the job you started to do and you really liked that part.

I would share just one thing from my side. I am actually much more on what Uma referred to as learning the process. For me, data science is not just the ML part; it is really the whole data science process. How do you frame a problem? How do you convert a business problem into an analytical problem? How do you refine it? How do you transform it? How do you visually explore it? How do you model it? How do you select a model? How do you build
something on top of it as a service, and then build an application on it? This whole process is really important for me. Even though I know the focus here is math, for me that is one of the important things to understand: what we are talking about here is not learning math for math's sake. We are really trying to say how we solve a business problem, be it a click, how a car is being driven, whether something gets booked or not (I can't say much there, because I don't understand much), or which grocery to buy.

So for me it is really the process. We will open it up, and the format we will follow is that anybody can ask a question, one question per person. I will start with my first question, a couple of people from the panel will reply, and after three questions we will be done.

My first question is really about this part: somebody who has not done this before has a business problem, or some problem. A lot of people are interested in this because they think they can use machine learning, data science, AI to solve a problem. How do they start? How do they start learning that? And in that starting journey, do they really need to worry about math, or is it okay to start without worrying about the math? I am not saying without knowing the math; without worrying about the math. I am going to call on the mathematician to answer that, then one of you can also pick it up, and then questions; whoever has questions, just start raising hands. The person starting off here is really fresh, with no experience in this: maybe a programmer, maybe a media person, maybe a storyteller. I know there is TensorFlow for Poets, if I can play on that. Really, anyone who wants to solve a problem.

I would say that, as I mentioned, if I were a person who has not done this before, I have a new domain and a new problem, but I really understand at least the problem side
of it, what the objective is that I am trying to achieve. Let's say I am in job search and I want to show good jobs to people. If I can put it in that manner, I can find similar problems. There is a lot of research out there; maybe we don't even need to go to research, and example problems are out there in some packages. I would try to find the similarity of my problem, the way I have defined it, to other problems that have been solved before, and take it from there. So I would say that, at least to begin with, math is not necessary. If I don't have the requisite background, I can probably find someone to do it.

To start with, I don't think you need to know math. I would say use it like a magic wand. Once you use it, it will be successful or unsuccessful. If you are successful, the next thing you have to worry about is how this magic is working and how you can make it better. Or, if it is failing, you need to understand why it is failing and then go deeper. So that is my advice.

So, peeling and doing. Yes, everyone is going to answer.

So I can talk about the way I approach it; I am not saying it is the best process. If I know nothing about the math, typically, every time I work on a new domain I first try to understand the problem space. For a car, the problem space means: what are the different components of a car, and what are the processes that generate the data? That is one. The other thing I try to do is look at the data itself. Just take the data and look at the various features. For example, look at the distribution along one axis: is the distribution skewed, is it flat, what are the outliers? So understand the dataset first, and explore it, maybe visually, which is why I love ggplot. If you throw a dataset at me, I will first go and try to visualize it, maybe in ggplot, maybe in Tableau, and see how the data looks. So before I go to the math, if I understand both the problem space, like you said jobs, or for example how do I tune a
car or how do I predict accidents, and I understand the dataset, I think that is about 30-40% of the problem right there. And then, as you said, maybe look at how people in that domain have solved problems and start digging from there.

So I like the car analogy. You want to go from A to B, so you want to learn to drive a car. You really don't want to know how the carburetor or the gearbox works when you first start to learn to drive; you need to first figure out how to go from A to B. And you can use the R/ggplot version of the car, you can use the scikit-learn version of the car, you can use Excel, you can use pen and paper; it doesn't really matter. Use your car, use a bicycle, use whatever is there, and start going. Then at some stage, when the car doesn't work or something happens, you will say, "Okay, I need to know where the engine is, or I need to maintain it; I now need to know exactly what goes into it." There are so many knobs in the car, and I only use two of them. That's the reality: you basically use two or three things in the car, and you can pretty much live without a clue about the rest. But at some stage you will realize there is some factor to tune, or you need to clean the windscreen: how does that work? The wiper is out of water, and I need to figure out what to put in there.

That brings me to this question. I know a few of you referred to Andrew Ng's course, which is about understanding the internals. I have a very different view on this, which is, as you three mentioned, to really start doing something. Take public data, take any data problem that you have, and start making something out of it. Do this process of solving a problem end to end. Then you will figure out that there is actually more to it than the math when you want to solve a problem: cleaning, refining, visualizing,
all of that is going to be a big chunk of it. As for the rest, you will realize when you need to understand the internals. That's why I actually don't recommend the Andrew Ng course for anyone starting out. No offense, it's a fantastic course, I'm not saying it's bad. It's a nice course for understanding the intuition. I also do a Hacker Math workshop, and I don't recommend people come to that until they have done three months of projects, a few projects, before they can understand it. Whether it's hacker math or book math doesn't really matter. So, I don't know, any opinions?

Yeah. When I started, I actually thought the same way. I took a problem: I wanted to predict the outcome of a cricket match. So I took the data from cricket, all the commentaries from 2000. My opinion is that unless you have a pre-built model, code actually written by somebody, it's extremely difficult to make that work if you don't know anything. It's super difficult. My task was really simple, because I had each commentary listed, which I took using a Python scraper, and getting the info from the commentary was super easy. But I could not get it to work. TensorFlow had a lot of examples on their website, but I could not get it to work. That's when I thought I should understand why this particular line of code in TensorFlow is not giving me the answer when they are getting the answer for some other data set. That's when I went to CS229, the Andrew Ng course, and that took me a long way.

But you were talking as somebody who was already in this, right? I'm talking about somebody who is just starting.

No, no, I'm talking about when you were starting. This is what this was...

Okay, so a different perspective: go much deeper. I'm going to beat on one more.

Yeah, on the last question I would like to add that if there is no car then there is a cycle as
well. And had there been no cycle, you would have thought of some other approach to solve that problem. It might not have been that accurate, but you would have thought about how to solve that particular problem. That is called analytical thinking. If you get a problem, first you have to think about what should be done to get the required results. Then you get to know that there is Google, and who is giving you cars on rent, so you google it. (Or Zoomcar. Yeah, Zoomcar maybe.) What are the different approaches to solve it? It's the same problem, and that's how you learn. That's where you begin. When there was no data science, or no formal word called data science, people were still solving problems, through Excel, with pen and paper. So you have to think that way. You have to solve a problem; what needs to be done? That analytical thinking should be there. Then you should think about other approaches that are far superior to the traditional approaches, and that's how you will learn. That's how we began, or how many people did in the 90s. Gradually they moved to ML, and now we have computational power, so we are talking about deep learning and all.

Anyway, talking about the Andrew Ng course: yes, it's a good course, but I didn't start from the Andrew Ng course. I started from the same analytical thinking. We had a business problem, I had some resources, and I had some peers around who were solving that business problem, so they gave me a jump start. Gradually I realized that I had to learn more about it. I started with market mix modeling, which was more of a regression problem. Then I got to know about targeting, how to target customers, so I came to know about classification problems, propensity models. Then I was like, okay, I'm not going to use words, but clustering is how to, you
know, create groups within data which we don't know anything about. So two people with specs, three people without specs: two clusters. Those were the fundamental things that I got to know, and then I started building on them, bit by bit. It's day-by-day learning. You won't get success every day, but if you do it every day then one day you will get success. That's how it is.

So let me beat on one more thing, and then you add to that. The other misconception about why math is really needed comes from the fact that a lot of people starting off see Kaggle or these competition sites as the way of doing it. One of my concerns for people starting off and looking at Kaggle is that Kaggle focuses on one part of machine learning, which is really accuracy optimization. It has enough interesting examples also, but the focus is not so much on solving a business problem as on a very narrow part of the business problem. I don't know, people may have a different view. But you go there and you start to think that you really need to be doing hyperparameter tuning, or working on the last percentage of accuracy, to solve a business problem, while in real life, or at least in my experience, that may not be needed. Both of these are fantastic tools, I'm not denying it. I'm just beating up on the popular ones to get a response from the panelists.

Yes. So one course I would like to recommend, which I found really useful and keep going back to even though it's fairly basic, is Udacity's Statistics 101. It's really well thought out for somebody who's starting off. The Sebastian Thrun one. There are more courses, but I think that one is really good because you can relate to it as somebody who doesn't know a lot. The other example is from the tool side. From my own experience, I always, in
some sense, at least initially, we used to look down upon Excel. But I have realized that Excel is a fantastic tool. A great example is at InMobi. We built all of these, what, a Hadoop cluster and, you know, buzzword bingo again, but we built all of these different tools, real-time stuff. And then when the analysts actually started working, we just gave them an API. All that API did was run a query on this large data set, whatever, 4 billion records we used to get per day, and sometimes they would run it over 7 days or 30 days. It would churn through all of the data and give them a zip file, which you paste into Excel, and it would unzip whatever the results were. Then the models were built in Excel, including visualization. That is when my eyes really got opened to doing a lot of fantastic work in Excel. A lot of people don't know this, but if you can program in Excel you actually know functional programming. But again, when somebody approaches functional programming it feels like, wow, this is magic. So I just wanted to bring out that aspect: it's not so much about the tool as about knowing the problem space. You can do really fantastic stuff using just Excel. You just need to know how to use it well.

I can attest to that. I spent 10 years as a management consultant, and my tools of the trade were Excel and PowerPoint. You can really do fantastic stuff in Excel.

Okay, before I move on to more questions, does anybody else have a question? Anybody? You can ask.

Yeah, so, a lot of those were really fantastic ones. I think, are we also better off having this distinction of what is data science, what is ML, what is data analysis? Because we're just throwing it all around and mixing it all together. Would we be better off if people understood? Because ML is part of it, but when you were going to say data engineering, what is it
actually? You know, everyone keeps talking about all these buzzwords, but I don't think we are able to get out of it. Data science could be a process, it could be a technique, it could be an algorithm. Could you just give a distinction, from a very basic point?

I'm not... the question is for you. Yes.

Yeah, so the way I look at it, at least: one is obviously the statistical part, where a lot of it is about what happened in the past, and probability is looking at data and predicting or figuring out what might happen in the future, or what is likely to happen. That is one. The other distinction I found really useful is supervised versus unsupervised. To put it in the simplest way, supervised is where you know something, and you use what you know, the labeled data, to figure out similar data: either classifying the data set or doing something on a newer data set which is similar to the data set you already have, figured out and labeled by humans or by some other algorithm. I think you've already gone to the... I'll just complete. Unsupervised is where you have a data set, you don't know anything, and you want to figure something out. Like I mentioned, topic modeling: I just throw in a corpus, and topic modeling tells me something which I otherwise could not have figured out. So largely I feel that these are the distinctions, and then you can go deeper and deeper.

So let me try and address that in a slightly different way. So, process. There's obviously a process part to it, solving the business problem. All of you are taking a business problem and converting it, so you frame the problem. There are basically five or six kinds of problems that you have. One is very descriptive in nature: I want to describe the data, I want
to see the shape of the data, I want to see the trends, the patterns, the outliers in the data. It's called descriptive statistics, descriptive or exploration. I want to explore the data. So that's one kind of problem; I'm combining the two. The second is inquisitive in nature, where you have a guess, a hypothesis, and you want to test whether it's really true or not. Let's take the example of loan defaults. People are defaulting. I want to see whether the loan default trend is going up: that's descriptive in nature, I want to see the trend. I want to see whether older customers default more: that's a hypothesis I want to test, and that's inquisitive in nature. The third bucket is predictive: I want to predict whether somebody will default or not. The fourth bucket is causal: I want to understand why customers default. So those are the four. There are actually six according to the science, but you can think of four buckets: descriptive, inquisitive, predictive and causal. Most of what we talk about in business is one of the first three kinds. Causal is a much harder kind of problem. So that's the problem side.

Now we need to solve these problems in some way. We have many tools to do this, and we have a process for doing it. The first part is the framing part: making a business problem into an analytical problem. Then comes the data wrangling, data engineering part that many people have referred to, which is getting the data in: acquiring the data, refining it, cleaning it, and shaping it into a form that you can transform. That's the data engineering part. Then you have a visualization part, which is the exploration of the data: I want to see these problems. Then comes the modeling part to solve the problem, which could be just simple statistics. It could be just counting, which is
called the frequentist approach. It could be probabilistic, which is the Bayesian approach. And if you do frequentist at scale, with a lot of machines behind it, that is kind of the ML approach. That's the modeling part. Once you've done that, you can answer the business question again, so you translate it back into an insight. And once you have that insight and the answer, you want to communicate it. Whether that's an application, or as some people said an API that people access, or a dashboard, or just a report communicating what you understand, doesn't matter. There is an output going out. If you build an application, then you have the whole deployment part, bringing it out as a service. If you have a dashboard, that means linking it to a visual exploration tool that people can explore to get the insights. It could be an algorithm. So you can think of it like this: I have a problem, I want to wrangle and clean the data, I want to visually explore it, I want to model it using some techniques, and then I want to communicate the insight, maybe as an algorithm. That's the process; these are the different chunks of it.

What Vinayak was referring to is that now we are going deeper into counting, the frequentist approach. When we count, if we know what the output is going to be, we call it regression if it is continuous and classification if it is categorical. It doesn't matter: we are counting stuff and we want to build a model around that. That's where, I guess, the math question comes in. Math comes in everywhere. Math comes in cleaning. Visualization has a grammar, so there is math in visualization. Math comes in the wrangling part too: how do you handle data, how do you hash stuff, how do you store stuff, how do you access stuff on hard disks? So that's one way to think about it. The other way to think about it is... I mean, I know I can keep going on. Does that help answer your question about how we are thinking about it?

So I just wanted to
let you know, because I am a decent man. I am not... What I am saying is, let it not be a barrier to exploring stuff. The math can be figured out. And as I said, maybe you don't need to understand all of the nuances of something, but that doesn't stop it from being useful. That's all. I just wanted to clarify that.

No, I think that's clear. The hacker approach, and I think that's also perfectly fine. I actually like the peeling-the-onion approach: as you learn more, you learn everything more, more tools, more math, more domain. Everything goes together in one way. More questions from the audience? Otherwise I will ask. Yes, go ahead. Your name?

My name is Abhash Gurwal. I have around 9 years in IT and I am working on my own startup as of now. Now I am giving you a business problem and I would like some answers from you: how do you want me to approach it? I am pretty new to ML.

I don't think we can do consulting on the fly.

If you guys have faced that kind of problem, even some examples, like the vehicles problem. You have shown that we are getting some sensor data and analyzing it. In a similar way, just at a high level, how do you actually do that?

I would like to pause, because I want to focus the question. You want to solve the problem. Are you scared of math to enter this field, or are you okay with it?

Yeah, I am good.

Okay, but then that's the topic of the session. We are really trying to understand how much, how deep to go into it, not to solve the problem, because I am sure each one of them could spend hours trying to solve your problem.

I am not asking you... How you approach it on the math side, you can just suggest that to me.

I think, yeah, let's take it at the end of the session. Let's take it at the end, because I really want to focus on math. This is a big topic.

I think all of you mentioned the term model. I am not sure how many people in the audience understand this, but what is a model? Okay, what is a model?

It's a recipe, a recipe to
solve your problem.

A little less abstract, please.

Let's say I am doing ranking. I have a web query and I am ranking documents in response to it. When I say I am building a model for it, I am actually specifying a way to match the query to the documents and come up with some kind of score which will allow me to rank the documents. Another example: suppose I am doing spam filtering, so given comments on LinkedIn, is it spam? When I say I am building a model for it, I am again specifying a recipe for how you can come up with a score which will allow you to identify whether it's spam or not. So you can think of it basically as a mathematical function which will learn from your data.

That's right, yes. I would like to take yours one more time.

So first of all, when we say model, in very, very layman's words I would say we try to replicate what data we have, what intuition we have, in terms of a mathematical equation. In some sense it's also a view of the world, like how you look at the data set. Then, based on what it has learned from the data set, you give it one set of inputs and it will give you an output. So it is, in some sense, for lack of a better word, a recipe is a good example, or a transformer you can say. It takes one set of inputs and, in very, very abstract terms, gives you a label, if it's a classifier kind of thing.

You said that was if you were being abstract, but what if you were...

Mathematical function is probably the most... Can we call it as...
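As a rough illustration of the "model as a scoring function" idea from the spam example, here is a minimal Python sketch. The words, weights, bias and threshold are made-up assumptions for illustration only; in a real system the weights would be learned from labelled comments rather than typed in by hand.

```python
import math

# Hypothetical weights for illustration; a real model would learn these from data.
WEIGHTS = {"free": 1.5, "click": 1.2, "winner": 2.0}
BIAS = -2.0


def spam_score(comment: str) -> float:
    """Map a comment to a score between 0 and 1; higher means more spam-like."""
    words = comment.lower().split()
    z = BIAS + sum(WEIGHTS.get(word, 0.0) for word in words)
    return 1.0 / (1.0 + math.exp(-z))  # logistic function squashes z into (0, 1)


def is_spam(comment: str, threshold: float = 0.5) -> bool:
    """Turn the score into a yes/no decision using an assumed threshold."""
    return spam_score(comment) >= threshold


print(is_spam("click here free winner"))
print(is_spam("see you at the meetup"))
```

The same shape applies to the ranking example: score each document against the query and sort by score.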
I mean, the programmers call it an algorithm, and the data set is called the model. Another way to see it: most of the time what we do is test our hypothesis, so anything that helps you test your hypothesis is actually a model. We try to think of a hypothesis, like, this is happening because of this particular thing. So we try to form an equation to prove whether it's happening or not, and to what extent; we want to quantify that relationship, the cause of the relationship. So in simple words, any statistical hypothesis could be called a model.

I actually like the recipe way of thinking about it. I think of a model as rules. You can have heuristic-based rules. Going back to the loan: who should I offer this to? Let's say if somebody has an income of 50,000 I will give him a loan, otherwise not. So I have a rule, and I can have many sets of these rules, many sets of recipes, to make this decision. That's rule-based or heuristic-based. But these rules are brittle. They may change because people may change, the situation may change. We don't want to generate these rules manually, at least in the ML world; we want to generate these rules from the data. So that recipe needs to be generated from the data. That's what we are trying to do, at least when we are trying to automate this or use ML to do it. So what is a model? A model is the representation of this rule. And that has basically three conditions. A pattern exists, because a pattern needs to exist for me to find it. I cannot just create it myself; I can learn it from the data. I cannot do it mathematically... I mean, a simple rule-based one can be done mathematically. I'm now going into the ML definition, but that's kind of where it is.
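To make the contrast between a hand-written rule and a rule generated from data concrete, here is a small Python sketch. The 50,000 cut-off comes from the loan example above; the borrower history and the `learn_threshold` helper are hypothetical, and a simple one-feature threshold search stands in for whatever learning method one would actually use.

```python
def handwritten_rule(income: int) -> bool:
    """The manual heuristic: offer a loan only at an income of 50,000 or more."""
    return income >= 50_000


def learn_threshold(incomes, repaid):
    """Return the income cut-off that misclassifies the fewest past borrowers."""
    best_t, best_errors = None, None
    for t in sorted(set(incomes)):
        # Predict "will repay" when income >= t and count disagreements with history.
        errors = sum((inc >= t) != ok for inc, ok in zip(incomes, repaid))
        if best_errors is None or errors < best_errors:
            best_t, best_errors = t, errors
    return best_t


# Made-up history: income and whether the loan was repaid.
incomes = [20_000, 30_000, 35_000, 42_000, 60_000, 80_000]
repaid = [False, False, True, True, True, True]

threshold = learn_threshold(incomes, repaid)
print(threshold)                   # cut-off suggested by the data
print(handwritten_rule(42_000))    # the manual rule rejects this applicant
print(42_000 >= threshold)         # the learned rule accepts them
```

If the situation changes, the learned cut-off can be recomputed from new history, whereas the hand-written rule has to be revisited manually.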
The pattern exists, it cannot be just a mathematical transformation, but I can actually develop a function to understand it, and I can learn it from previous data or some kind of data. And then once I build that model, if I get new data, I can learn something new from it. This is more the ML, predictive kind. But I like Sharin's point also. Algorithm: like sorting, there are so many sorting algorithms. For me, algorithms are more like families of tools you apply, in the sense that that could be a family to solve it.

Just to give the analogy of a programming language: a programming language also takes a set of inputs and converts it into another, and every programming language imposes a model, a view of its world, on you. If you use a functional programming language, it has a view of the world that everything is functions and functions work on data. Imperative programming is quite different. So similarly a model is, I would say, a view of the world as well as a representation of the patterns in the data in a workable way. Workable in the sense that if you give it a similar kind of data, it should at least give you a similar kind of output.

Yeah, I can actually support my argument for why I call it similar to an algorithm. For example, let's say we are doing an XOR operation. We can actually write down the algorithm. But you can also build a neural network, and that neural network, I call my neural network my model, actually learns this algorithm, the rule it has to implement. So this is the reason I call it that.

I would like to say, the reason I would disagree with algorithm is that you know how an algorithm works, but you may not always know why a model works.

Yeah, basically the model is trying to learn what to implement.

Because an algorithm is a series of steps. But you have this machine where there may be multiple trapdoors
and you don't know which trapdoor will open when, or which input will get fired when, or which path will be taken. Which is why I feel that algorithm is at best a partial analogy in some sense.

So basically, even when you write a model...

Yes, yes, maybe to some extent for supervised learning. But for unsupervised, the algorithm may not completely explain what we call the model, what we are trying to do. It's the simplest questions which are the hardest to explain.

Yeah. I am not up there, but can I just add something? If I look at it in pure English, a model is basically this: let's say you have a question and you want an answer out of it. The process which converts that question to the answer is the model. So if you have a black box, you have the input, you have the output, and the black box is your model. It could be an equation, it could be a sequence of steps you follow. Whatever it is, that is your model. It can be a domain expert as well.

Yes, an expert-based model. Yeah, true. We all have to go back and come up with a better, succinct definition of model.

So for me, the difference between algorithm and model is that a model is instantiated with particular values. Let's say I just say that you match your text data, the query data, with your document data. That's an algorithm, a series of steps. But it doesn't say how you match, what weight should be given to each feature. When you actually assign those values and have concrete steps which can actually be coded, then that's a model. Before that it is an algorithm. That's my equation. And, to introduce a new jargon, learning those weights is what machine learning is all about.

Yeah, I have a more mundane question, more rudimentary than that, and if it's out of context, it's out. But given the ways of the profession, or anything like that, why would one, let's say, focus on data?

I don't think everyone should, to be honest. I really don't. I'm surprised so many
people turned up here on a Friday evening. On a Friday evening! What makes you think the data science field as such is so...

So, I can probably answer that. Let's say, if anyone wants to add to this query or make it complete: there are so many options out there, such a wide variety of options one can go for, whatever they are. So what makes it... As I told you, if you think it's complete...

No, no, that's fine. It's a valid question. People want to get a job; many people are trying to get into data science now. Why should they? That's the question, I think.

Not really, that's what I'm telling you.

So as an engineer, personally, like I said, I'm an engineer, so for me it's another tool in the arsenal. It's the same reason why I would learn a new programming language or a new framework. Though data science, I would consider it a large enough field by itself, it's another tool in the arsenal. If you go to the real world, which is like Kurukshetra, and you're Arjuna, you need all of these arrows to hunt down the problem. So I feel that in my quiver of arrows it is another one, just the way I look at visualization. The same question can be applied to other fields also: why do I need to understand visualization? It's a way of solving a problem, regardless of the algorithm. So if you are really interested you can become a specialist, or if you are a generalist like me, you need to know enough, at least a few feet deep, to know what can be applied. In that sense it makes sense. If you want a job, yeah, why not? There are so many people working here. You can become a generalist. Just like everything else in life, it's not necessary to know data science. But if you are interested, yeah, you can make it a full-time profession; maybe you can even know one specific thing very deeply. It's all up to you.
Sorry for the philosophical answer.

I think, for me, a lot of it is out of fear. I see a lot of people come and say, I want to become a data scientist. I say, okay, I am doing a workshop three months later. They say, no, I want to start tomorrow. And I am like, okay, there are so many other options; pick your learning style and go learn if you want. So for me, a lot of why people are gravitating towards it is driven either by the smell of the golden pot they see at the end, or by the fear that their current job may become redundant because of automation and the stuff that's happening. That is my opinion.

Right. So for me, as an engineer: as engineers we always try to automate something, and the power of data science and machine learning is that you can automate things and make processes more efficient. With the whole world moving towards automation, I think there's a lot of fear, like Amit said, that your job is at stake.

I think that's fine. Learning skills... most of us in life will now end up doing two or three different careers, and that's fine. People end up picking it because we're in Bangalore, those are the kind of jobs you'll get, and because of the environment you end up there. Sure, that could be one reason. But I would really say: if you are interested in it, if you want to look at the world through a data-driven lens, and you're curious about using that lens to understand, and you also understand that it's only one lens, then dive in, like all of us here.

And why it is probably important: for a long period of time, at least for me, having worked in software for almost a couple of decades, we never had so much computing power, we never had so much memory. Moore's law has taken over. What is now data science was always important, though it was not called data science, but we never had the computing power to do it, nor did you have as much data as today. Like I was
talking about sensors: sensors generate so much data. It was not even possible in 256 KB of RAM. Today a lot of us have computers in our pockets, called smartphones, that are much more powerful than the computers we had on our desktops decades ago. So it has become more accessible, in terms of computing power, in terms of the variety of problems, and in terms of data generation. I feel that is a reason why it has become relevant. And there is also the fear factor, which is, will I be irrelevant?

But I would caution that this is just one lens for looking at the world. Just because you have so many sensors and they keep generating data and you will have so much data, there will be problems to build on it, okay, but that's still one thing, because you cannot measure everything in life. And yes, there are more important things in life on a Friday.

I would like to add one thing. Had there been no patterns, there would have been no data science. But there are predictable patterns, and now we can actually predict them, and hence we are moving towards data science, and hence data science is touching all the different fields. If there is a pattern which a machine has learned, then the machine is not going to make a mistake, but a human can make an error there. That's another reason data science is touching all these things. So patterns are one important thing.

One more. He is definitely on the convert side. I would say I am much more on the skeptic side.

I am also on the skeptic side. We don't need to look very far. A lot of you said you use Facebook. Look at the translation that Facebook does. There is a separate community, for example, for Facebook translation, but sometimes it's so fantastically bad.

Yeah. I think we are veering away from the topic, but a lot of people actually talk about
singularity and other stuff, and as I said, I am on the skeptic end. But there is a Bill Gates quote which says we underestimate how much progress we will make in 10 years and overestimate how much progress we will make in one year. So will we solve translation in one year? Maybe not. But in 10 years, who knows? So there will be progress, but right now I think we are at the top of the hype cycle, so we overestimate how much we can solve in the next one year. Data science will get me... I don't know. We will probably not solve the problem, but it can get a few jobs for sure.

That's kind of my next question. Somebody is going for a job in data science... Oh, this is recorded, sorry. You know, we can answer. I think a lot of them are already in data science, and a lot of people here have come with their colleagues. Right, I can see. You are asking for a friend. So, asking for a friend: if somebody wants to interview and get a job in so-called data science, where the job profile will probably list 15 different tools, everything possible, data wrangling, data cleaning, AI to Excel to Spark, Hadoop, you name it. So there is a large list. And assuming they make their way into the interview, I feel like there is this inevitable question, like, write this in TensorFlow, or tell me how perplexity works in t-SNE. It doesn't matter, I am not doing buzzwords, I am just saying. Or, why do we regularize? There will be some math question behind some algorithm where you have been turning the knob but you have no idea what happens when you turn that knob. So for people getting into this, looking for a job, interviews, and I know all of you have jobs and you got there: how did you manage, or what advice would you give? Right,
starting from the math expert.

Yes. Since I work in jobs, I do have some background on what kind of jobs are there and what they ask for.

You probably ask those questions.

Yeah. So at least in my company there are two kinds of problems. One requires you to have a math background, but that is more of a scientist kind of problem. The way I have seen it, when people put out advertisements for data scientists, that is more about knowing the toolboxes and working around them. They don't expect you to know the math behind them. You should know programming, how to apply those toolboxes, which toolbox to apply, what kind of algorithms are there, what kind of data. If you are working on large data then you should know how to pick. So that way you don't need to know math; you need to know more programming.

Okay. Other people who have been on the other side of recruiting?

I have the same opinion as Uma. When we hire, what we look for... we work mostly on CNNs, so what we look for...

XGBoost is the only word, okay? You need to say that.

Yeah. So what we look for is not the math. I don't remember asking any math questions till now. What we look for is how well they understand how the whole network works and how, if something goes wrong, they would debug the network. So yeah, I never asked a math question.

Would be good if both of them were the only people interviewing. Any other perspective from interviews you've given or taken?

I think the best way to give or take an interview is to give a data set and ask the candidate to solve it, understand their thought process around the solution, and then maybe a bit of math on how these models work. If you want to tune these models, you need to know the math, so ask those questions. Having been on both sides, I think this is the ideal way.

How many times do you see this ideal way?

There are a few companies who do this, but a lot of them really take math to the next level.
So for me, again, it really depends on the job role, right? As Uma pointed out, there are different kinds of data scientist positions out in the market. Some of them are research scientists, some of them are on the business side of solving problems, and then there might be an intermediate between these two. So you need to choose where you want to be and develop your skills based on that.

So this part of the question is a bit like, how do you acquire a full-stack engineer?

Yes, so there are two problems, right? Again there are two categories. You know back end and then, like me, you know back end really well, you have done something in jQuery, and you call yourself a full-stack engineer. You really are not. There is only so much time, and you can be good at only so many things. So, as everybody was saying, there are choices you make. As she said, one is the data engineering aspect and the other is the data science, or the machine learning, or whatever you call this other math aspect. So if I were to hire, I would probably look at where in the spectrum you lie, and where the job lies in that spectrum as well, right? It has to be a good match. In some jobs you might need a lot of data engineering skills and not so much math skills, because maybe there are experts already. But you need to know enough so that you don't do maximal damage. No, it's true, I mean, it's a fact. In other cases you might have the data engineering figured out. It also depends on where the organization has evolved. Giving the example of InMobi: a lot of us came from the data engineering side and we were just pretend data scientists, right? But as we got more sophisticated, we realized that we were hitting the wall, and then we started getting people who knew better than us. That's how I learned. They would tell us, "You are an idiot," and I would say, "Yes, tell me more." So that's how you learn, right? So I would say, where does the job fall in
that spectrum, and where do you as the applicant fall in that spectrum, right? Once I tried asking math. This is an embarrassing admission, but I started asking math, and this was a guy I actually met at Fifth Elephant later as well, and he schooled me on how classification algorithms learn. So I realized very, very fast that I was running into the limits of what I know. That also happens, right? So basically that's what I would say: where in the spectrum does the job fall, and how do you as a candidate want to fit in? Sometimes people have very high aspirations. In college, everybody wants to become kernel engineers, right? Or if you are into physics, they want to become not just any physicist but a nuclear physicist. Similarly, I think data science is at the height of the hype cycle, so you need to figure out where in the spectrum you land.

Okay, we'll take a few questions from Twitter and then come back to the two hands. So there's one question... okay, so we go back. Okay, first you and then you.

Yeah, hi, my name is Aram. I'm also an independent consultant. My question is not from the job perspective but as a hobbyist. Let's say that I'm a hobbyist and I want to explore the data sciences. All of us have different levels of understanding of mathematics. Somebody who has done a CS degree might know a little more about linear algebra or discrete mathematics, but others might not. Now, if I want to just test the waters and see whether this is my cup of tea, because it might not be everybody's cup of tea, are there certain kinds of problems that you can just play around with? Standard problems which everybody can explore, so I can say, "I'm spending my free time for two months just playing around to understand: is it as glamorous as it looks, or is it not that much fun?" Are there any such things that people can go from?

I
have views, but let's ask. So people want to dabble in this. What's the best way to dabble in this?

Right. I think the best way is to get your hands dirty. I'm sure there are a lot of data science courses online, but I would recommend you get your hands dirty and read a lot of material. I'm not a big fan of watching videos, because whenever I try to watch a video I fall asleep, so I read a lot. There are different kinds of people, so just follow what suits you. But I think you should just get your hands dirty with a data set. Try exploring it with any tool of your choice. It might be Excel or pandas or anything, but just get your hands dirty and then you will figure out what you need to do. There are a lot of public data sets available. The UN has a lot of public data sets, and if you just Google "public data sets", both within India and outside India there are so many. There are Twitter accounts too. The way I personally learn is that I follow people who are in the field and keep tweeting interesting links and interesting data sets, and that's how my learning is driven.

Yeah, for example, I think it is not a lot of data, but the government has a lot of, say, economic data that you can use for prediction. Kaggle, I think multiple people have mentioned, has a lot of data sets as well. The UN has a lot of data sets, and there are certain Twitter accounts that actually tweet out public data sets.

I can answer that: Data Is Plural.

Data Is Plural. Even Reddit, actually. There are a lot of communities where people ask, "I have a data set and this is what I want to do, but I don't have the skill set." So data science communities on Reddit will also point you to a lot of these. They have pinned threads that you can look at.

One fun fact here: you can really use open data to get a lot of benefit. When I was part of my consulting company, we signed up for a BangPypers talk. BangPypers is the Bangalore Python community group. We signed up for a BangPypers talk
and we didn't know what to do. We wanted to show the power of data analysis, and then we thought we would simply use one of the open data sets available. We went to the open government data portal and picked up one of the healthcare data sets of Tamil Nadu, on how many medical supplies are used in the government hospitals. We did some nice data analysis, went to BangPypers, and spoke about it. One of the people in the audience was the CTO of a marketing campaign company. He was really impressed, and later he became our client. So it can really benefit you.

The data science field is super rich, right? You have different categories inside. So if you got excited because of Google Translate or DeepMind's AlphaGo, then the data engineering field might be super boring for you. So pick the field inside data science that you are interested in, and if you explore that, it will probably be more interesting.

Just to add one last thing: you can also think about hackathons, which are open, and you will get kernels from the best of the data scientists. You can check what you have done and benchmark against the best of the data scientists who have worked on that data set, so you will get to learn more from that. That's another option.

I have a very simple approach. Data science, whatever it is, is a very lonely endeavor. I mean, it is both. It is a group sport, but it is also a lot of time spent doing the work, right? So my thing is, pick a problem that you are passionate about, because you need to spend some time doing it. And then, as many of them have said, try to solve it in whichever way you can. Whatever your skill set is, leverage it. It is all data science and it doesn't really matter. Try to pick a problem that you are interested in, and as part of solving it you will figure out where you need to go deeper: to learn more tools, more frameworks or algorithms, or more math, if that comes into it. I think open data is a good way. So, problems similar to what Raghotham has done. We have done, you
know, different kinds of analysis at BangPypers. For example, we have done weed pricing analysis, fire and dance, kind of Game of Thrones, and onion prices. Onion prices are my favorite: the great onion crisis of 2010, where onion, iruli, kandha, everything was so expensive. So those are the topics I was interested in, and we investigated them as part of open data and built tools. You can do it in any way, but you have to spend time to do it. So pick something that you are interested in, and then pick a learning style that works. Don't watch videos if you fall asleep. Don't read if you don't read. Or if you like to be this kind of game player, or like a high-pressure environment, hackathons, as he mentioned, are a good way to do it and at least create something. So figure out your learning style and do it.

One last comment I want to add is that if you are interested in a field, you take away one dimension which is important, which is domain expertise. If you are interested in a field, you are most likely to know a lot about the field, which means you are more likely to succeed, or at least get a direction in which you can go deeper from the domain expertise side. So even though you may not solve the problem, you will at least learn more about the domain. So maybe pick something in which you are interested, so that the domain expertise angle is minimized.

And I think one thing that I would also say is to focus a lot on the communication aspect, which we have been talking about. Ultimately, whatever you do, both understanding the problem and communicating it involve people. I know I said it is a lonely endeavor in the middle, but it definitely involves understanding the problem, which means talking to people, communicating, making change happen, or deploying an app. It doesn't matter; whatever you do involves talking to people, right? So focus a lot more on communication and how you talk about this
stuff. So as homework, go back and figure out what a good definition for models is, and similarly just figure out how to communicate that in a succinct way to people who will not understand this, right? We are all at different levels of our own understanding and trying to explain that as best we can.

Are there any math questions? We have not talked a lot about math. Second question, then.

I have two years of experience as a software developer. My job is not as interesting as it was two years ago. I believe I can solve better problems with ML and data science. So which way is better: to have an academic background or to learn on the job? For a learning path, which way would I learn better and faster?

Faster? That is a tricky question.

That is also a very tricky question. Every one of us has seen the variety people have. Personally, if I could afford to take that much time off, I would probably go back to classroom learning, but a lot of us can't because all of us are working, right? So now there are tons of resources which were not there earlier. For example, there is Udemy, and we talked about Udacity. So figure out what works for you. What works for you could also be that you change to a job in which there are good experts in a certain area that you are interested in, and go there. The other way could be to go back to the classroom, learn the basics, and come back, right?

I would say if you are rich enough to survive, then go for higher studies.

It is not only about the monetary aspect. It is also the time aspect which you have to see, because all of us... you can see the diversity.

Let me frame the four paths. There are four paths to learn this, which I think we have talked about. The first is to take time off, spend some money, and go to an academic course. The second path is taking some time off for a shorter time and doing a boot camp, a three-month kind of course. The next path is to do it on your own but,
you know, through a professional degree that is not full time: part-time courses, but still in a classroom kind of environment, right? The fourth path, actually there are five now that I think about it, the fourth path is to learn it on your own alongside whatever you are doing. And I guess the fifth path is learning it on the job, trying to find all of this there, right? They all have their own tradeoffs in terms of the time you will spend, the money, and how you learn. So you have to figure out what works for your learning style. If classroom works, if peer-to-peer works, you figure out what works. I can assure you of one thing, though: none of them is going to be fast, right? Each one of them is going to take time. As Peter Norvig says, there is no "30 days to learn a programming language", and there is no "30 days to learn data science". Actually, elongate the time over which you measure your learning. If you put a two-to-three-year mark, then at the end of three years you would say you know a bit.

By fast, I meant which way would be efficient, where I would learn more.

So that again: four paths, five paths, pick what works for your learning. I mean, only you can answer that.

First of all, figure out where you want to head to and what your fit is. Think about where you want to be five years down the line, how much you want to learn for that, and what needs to be done to learn that much. Do you want to get into a classroom course, or do you want to learn the skills and then switch jobs? Which is more towards your final goal, which is where you want to be in five years? I think none of us can give you the best answer. You will have to figure that out. The answer is within. I mean, it depends. You will have to find your fit.

Okay, first hand, and then we'll go back, and then there's one question on Twitter also, so I will take that.

My question is more of an observation. People in the panel have been saying that it's not necessary
to know math to get started or to get into a job, and I completely agree with that aspect. But I feel like we haven't spoken enough about why math is necessary. What would you miss out on if you don't know math? At least my perspective on this is that the further you want to go, the more a solid basis matters. You don't need to know how everything works, but you need a good, solid background in, let's say, probability or statistics. If you don't have that, it will become more difficult when you want to go deep, and if you need to understand that, it's hard to just go and suddenly learn stats and come back, whereas you can look at documentation of how to use an algorithm and use it. So I think that perspective, if somebody could speak about it: what do you miss out on if you don't learn math when you could have?

The simplest answer is you'll be stuck in a local maxima, to put it from a data science point of view, right? You have this large space to explore, and what you'll miss out on is that, yes, there are better ways of solving problems. Like I use Arjuna, and there are so many better ways of solving problems, and if you don't have a solid grounding it will hold you back, right? So I think people should not have any illusions that "I can get by without math, just knowing the tools". Yes, you might be able to get started, but you won't be good at it, which is true for a lot of things in life.

I think once you're building a career, then it's not unlike anything else: whatever is required to learn, you will go deeper. I don't buy this idea that you have to learn everything at the start. I mean, that is the academic way. But basic statistics, distributions, and, like he was saying, Bayesian probability, at least that you should know. Maybe Probability 101 and Statistics 101. There are a lot of these courses; maybe take one or two, I would say. So even when you go and read the papers, at least you can make some sense out of them, right? So I definitely feel that everybody should make an effort.

My only
question...

This is the only difference between being like a technician and the rest. When you start, you are typically a technician, and then you slowly become like an engineer, if I were to use that terminology, and then eventually maybe like a scientist, right? So you start off with the tool, which is like a technician. I can drive a car, but I am not a rally driver. For that I would need to know so much about all of those different aspects of driving, like when to actually make the turn, or how a certain part of the course would look. So it's the same thing. You can learn how to drive a car, but you won't become a rally driver or a Formula One driver. It requires a lot of skills.

No, so my only caution there is this. If you talk about people who are doing it already, then they are learning all aspects of it: more in engineering, more in math, more in the business domain, everything, right? When I said we don't need math for people starting off, it is because what I see is people go and do one stats or probability course, or one ML course, and then they say this is too hard, and then they never go further, because they never solve a problem when they are starting off. So, at least from my perspective, yes, you need to know, but if you take the course-based approach, then what happens is after a few courses you are done and you never get started. Once you are building a career, you will learn more in each aspect of it, even how to start a server. You will learn everything, right?

I am a huge fan of Richard Feynman. He has this beautiful book and talk called The Pleasure of Finding Things Out. For example, if you look at a flower, even if you don't know anything about physics, like how light is reflected and why the rose is red, you still enjoy the flower. So you can still enjoy the flower, but knowing one level below always adds something, like why
is the flower red and not blue or black or whatever, right? So not knowing doesn't stop you from appreciating it, but you gain a much deeper understanding of it.

I agree with it in the beginning, but...

Yeah, maybe you want to share that. Did you find a point where you said, "I am glad I know maths, because I would not have been able to go further"?

Yes, that's exactly what happened when I was at Yahoo Labs, and that's what motivated me to go back and get a degree. But I think getting to that point was very important. Knowing that other people are understanding it better and I am not, getting to that frustrating point, was required for me to go through the math courses.

Yeah, I think for me also it was like I hit a wall and I said, "Okay, I don't understand any of this." For example, how should I do sampling? Then I went back and read Stats 101, because unless I understood sampling, a lot of the stuff that I build on top becomes totally useless. So even the basics need to be there, so you can actually work from the fundamentals. If you can build from first principles, it's important, and that's why I think knowing math is important.

So one more thing. What I see is that a lot of people fear math, and that is because the way math is taught is totally wrong. It is only equations everywhere. But there are a few resources online, or books which you can refer to, which beautifully explain the application side of any concept. There is one book which I recommend you all go back and at least get a preview version of on Google Books. It's called A Mathematics Course for Political and Social Research. It starts from basic calculus and explains the need for limits, and what "limit n tends to 0" means, with wonderful examples. So maybe just go back and have a read. Even the preview version is very, very good.

I think we should probably talk about book recommendations also. Let me take one more question, and then one from Twitter, because he's been
waiting, or he or she has been waiting. Just to paraphrase a couple of them: assuming somebody is talented in this and has established an interest in data science, and wants to learn the math in parallel while they are at it, what is the learning path? Okay, so that's the question. Learning path, Sharon?

You should start from the basics. I mean, you should know about calculus, linear algebra, probability distributions, and a little bit about OR as well. Those are the four things. I would say linear algebra, calculus, and probability distributions are three good things to start with, and gradually you can learn more and get into the depth as you explore new algorithms.

From somebody who just practices the AI or ML part of it: I don't agree with saying you need to understand all of calculus. You probably just need to know how derivatives work, because that's why your backpropagation algorithm is actually giving you good results. But you need to have a good understanding of linear algebra and maybe a bit of probability.

So I would recommend, again, the learning style that works for you. If you were trying to learn, let's say, just linear algebra, and you like it very visual, then there is this nice course, Essence of Linear Algebra, and also Essence of Calculus, by 3Blue1Brown on YouTube. It is done very visually, as a geometric explanation of how linear algebra works. If you want to learn it much more in a classroom setting, then I would say go watch Gilbert Strang's MIT course on linear algebra, because that's a very grounded way of learning. If you want to look at the hacker way of doing it, building your own way, then I don't think there is a hacker... well, we have a repo called Hacker Math. But I think Think Stats and Think Bayes by Allen Downey are the hacker way of understanding both the frequentist and the Bayesian approaches, and Think Python is the third in the series. Also Statistics for Hackers. So that's kind of,
again, very much that kind of approach. If you want to look at the process of data science, then I recommend R for Data Science, which is Hadley Wickham's book and is really about the process. I think Julia Silge also has a recent one on text, but for the basic tabular data process, R for Data Science. And it's free as well. All of the stuff that I recommend is free.

We should officially bring it to a close, because you can continue to have it afterwards. Last couple of questions.

Is it already past?

It's 8:45.

It's on YouTube, right? Nobody is metering it.

We also might have families who might want to go back home. Last couple of questions, and then we'll wrap it up with a view from everyone else to digest, and then we can continue. Last one. He raised his hand first.

This is kind of a corollary to what she was asking. What if I am average at mathematics? In my whole college I only got 60 points, and I'm a backbencher. Maybe the way math was taught is wrong, that is one approach, but if I am average in math, is it even worth it, or would I be limited somewhere, so that it's not even worth it?

Tough question. I think learning math is very important, but how you learn it is different. Without learning math, as we all said, you can only go to some extent, and after that it is all a black box. So where are you headed to?

I mean, first of all, I have a business problem of my own, so...

You should know just enough to hire people, or you should know somebody who can hire for you.

I think, give it a try. Take a business problem and try to see if you can solve it. I mean, it won't take much time to figure out whether you can solve it or not. If you can't, then try to see how other people have solved it, whether it is available online, and whether you can learn that thing. Again, if it takes a lot of time, I would rather suggest you hire someone and ask him to do the work while you just understand what he is doing. So that would be a fair enough answer.

So on that note, maybe, I don't know, any last thoughts across? If someone can have a job
before we can have a family... Close, Sharon. Last words of wisdom across: one nugget that you want everyone to take away.

Yeah, I mean, I'll probably add to this same question. The specific problem that you have is kind of a research problem. You don't have any implementation yet in TensorFlow or any framework that you can just take and use, so that's kind of a research problem. I don't think you will go very far without knowing much math, because at some point you will have to sit and understand what exactly is happening. So basically you need to know math to go beyond the limit.

Other side. Uma, your words of wisdom?

So my words of wisdom are: learn by doing. I mean, even math you have to do. You have to code to really understand it. At least it worked for me. It was not just pen and paper and theory; it was more of doing. And secondly, since I have a startup at home, I think we are underestimating the business side of the problem, the product side of the problem. If you want to solve a business problem, math is going to be a very small component for the first few months or years. So don't worry about math. You will maybe hire somebody and get it done when it comes to it.

I would like to say, find your mix. When I say mix, there are three or four components of data science. We are talking about maths and stats, then comes the problem solving, and then comes the domain. It's not that everyone can be very balanced in all the four things, so you will have to find your best mix, and it depends on where you want to land up. The way a data science team works in an analytics servicing company like music, math, data is very different from the way people work in captives, versus maybe analytics consulting, or people working in consulting while taking help. So you will have to figure out where you are headed to, what your strengths are, and where you want to spend time so that you can reach your destination. That's what I would say.

I would say get started, get to work, don't
be afraid to ask questions. Just don't be scared of math or any subject. Start doing things and then you will always figure something out.

One last thing, because I said don't be afraid to ask. We have so many avenues, like Twitter and email. You will be surprised that if you send even the top data scientists a mail with a very specific question, a lot of them will actually answer back. But just make sure that you respect their time, both on Twitter and on email.

Well, on that note we will formally end the session. So we want a big round of applause for all our panelists for taking all these tough questions on how to begin on this. Thanks to HasGeek for actually organizing this. I think it's been very much the hacker way of doing research, but it's been a useful session.

I know what an editor did, but how do I describe an editor? I just want to thank the speakers in a hacker way, by handing over a t-shirt instead of a shawl. So thank you, Amit, for talking too much. Ishaan, thanks. We can continue the conversation downstairs. I think we are left with some snacks, because the downstairs people are home most of the time, and some drinks, so feel free to stay back and chat. And do come back to our open houses. Write to us. We are always very happy to hear what you would like to hear about and which buzzword you would like us to crack next. Also, I think there are workshops happening at Fifth Elephant. Amit is running a data science bootcamp. If you missed the boat, please figure out when the next one is going to start. And I think Raghotham has unfortunately given up the teaching part, but has been a great teacher himself.

I think what we should probably do is put a list together of resources like the ones that we discussed. So we are going to pull out whatever we can from the recording, and maybe we will...

Surprise, I am happy to do that.

We will probably put this up as a blog post so that everybody can see it. So give us one or two weeks to get this together. Thank you
very much. Thank you.