Hi everyone, I am Santosh, and I have come from Hyderabad for this session. Before jumping in: I am sure you have been engaging with analytics, machine learning, deep learning, training data and so on over the last couple of days, or indeed throughout your careers. I want to give you an altogether different perspective on what analytics is. In my view, analytics, or artificial intelligence, or machine learning, is another revolution that will play out over the next 10 to 12 years. Prior to that there was no analytics as such; there was the industrial revolution, and a person writing a ledger transformed it into a digital one, largely using Excel. There were analytical devices even earlier, but through this session I want to show you that a spreadsheet can become intelligent. What does that mean? That is what the next generation of tools is trying to do: how can I make a table intelligent, how can I make a light intelligent? Rather than acting as dumb devices, can they give some insight about themselves or their surroundings? I am currently a student at ISB, and I am working with Signity as a business process consultant, largely focusing on operational analytics; in the near future I will be moving into business analytics. Through this session I have two objectives, which will be the key takeaways for you. First, improving how we report the past. Can somebody tell me what reporting the past means? Analyzing historical data; the analysis is the second part. Here I am just talking about the reporting part: how do we report historical data? We all know that a visual is a thousand times, or a million times, better than the raw data.
So I will give you an altogether different perspective on presenting historical data, not just charts and tables. The next objective is enabling prediction of the future; I am taking one step towards making Excel intelligent, and I use Excel here to mean any spreadsheet. Let us go one by one. Some pre-questions first. How many of us have Excel on our laptops and desktops? Almost everybody. You may ask why we should even choose Excel when R, Python and many other tools are out there. How many of you have more or less 7 to 8 years of work experience in the IT industry? Most of you. But as experience grows, and I am not talking only about IT people, there are others too; for example, my dad has been running a business for the last 20 or 30 years. Does that mean he takes no decisions? Of course he takes decisions, and coding should not be a barrier for him to take them. How many of us open Excel at least once a day? Almost everybody; rather more often than we open R. Next, how many of us contribute or collect data? Either I give data to my manager or some superior, or the business collects the data. Everybody is playing with data; that is what this is all about. Next: can we eliminate Excel, or replace it with other software, in this era of big data and R programming? Any thoughts? Can we eliminate Excel in the next 10 years? No, so we have to continue using Excel. Then what can we do? Can we eliminate the mobile phone in the next 10 years? No. So what can we do beyond the current usage? Make the mobile intelligent, make spectacles intelligent, make even our software intelligent; one step of that is Excel.
The last point I want to share: can we extend the capabilities of Excel from business intelligence, which is reporting the past, to machine learning, and enable artificial intelligence? That is where I want to head. You might have seen this slide; it is one a professor of ours shared with us. Reporting the past is business intelligence. Predicting the future is machine learning. Taking one decision from that prediction is data science. For example, suppose over the last 3 or 4 days I collect feedback data from individuals; that is reporting the past. Tomorrow one more person enters, and I can predict what their feedback will be; that is prediction. Then I can take one action based on that prediction; that is data science. And what is artificial intelligence, which is the heading of this overall conference? Artificial intelligence is taking the best of multiple decisions: learning the rules and taking the single best step among them. So far spreadsheets have worked only in the first of these areas; my intent is to give you a little insight into how we can take them to the next level, and further, in the short term and the long term. The only challenge there is computational.

The key phases of analytics, to run through a couple of slides quickly, are goal setting, data extraction, data exploration (which could be visualization), data analysis, and modeling. I am sure every one of us has made at least one or two graphs in our careers; we largely depend on somebody giving us the data, we explore it, and the maximum analysis most of us do is drawing inferences from charts. I will take you a little beyond that. My focus for this session is the third phase and the last phase: the top row is the past, and the rest is the future. From here onwards we will see a couple of spreadsheets, for example one implementing edit distance.

As you can see, there are five phases, and at every phase I will quickly explain how Excel can be made intelligent. First, some important visualization. This is a video generated out of Excel, and the data is earthquakes and nuclear explosions across the globe; we can think of this as another way of exploring and showing data without using Tableau or Python. I will show you another type of visualization as well. How many of you have heard the name Hans Rosling? There are a couple of TED videos in which he presents time-series data. If you have a time series, what do you think is the best chart to represent it? A line chart, with time on the x-axis and the data on the y-axis. But he presents a time series as a sequence of cross-sections: on screen it shows 2014, then 2015, as a kind of animation. Let me show you that in Excel. I am not sure you can see the axes: the x-axis is the fertility rate, the y-axis is life expectancy, each circle represents one country, and the size of the circle is the population. You might have seen this data set. In general we cannot take in more than two or three dimensions in a single snapshot; to add another dimension we normally have no option but to apply another filter. But if we think about it like this, animation is another way of adding a dimension: for every time slice, the chart shows the individual values as if an animation were playing. Using VBA we can add more dimensions in this direction; I guess most of you are familiar with the playback feature in Tableau, and this is that kind of thing. That is what I wanted to share on the visualization part.

Now, how many of you have heard of the Levenshtein edit distance? For example, may I know your name, sir? Arun. What is the distance between the name Santosh and the name Arun, treated as keywords? What is the distance between Santosh and, say, Ramcharan? How do we calculate it? Levenshtein developed a method to calculate the distance between two keywords based on substitutions, deletions, and insertions. What I have done is include that in a spreadsheet as an exercise. There is an input, say "mushroom", and multiple options; once you click Suggest it tells me the closest match; here there is a tie, with two options having an equal rating. Maybe I give it some other name, Arun. Behind the scenes I am doing nothing other than running his algorithm: start from the left, compare characters pairwise, and through a navigation of the score table arrive at the distance between any two words. I am also trying to take this to the next level by comparing not two words but two different strings, which can likewise be done with a similarity distance.

One very important point I want to share: machine learning, prediction, classification, everything is nothing
but an optimization problem, with an objective (whether I want to increase or reduce the variance, increase or reduce the mean, or increase the accuracy), achieved by controlling some parameters (which could be hyperparameters or a complexity parameter), subject to constraints. In that frame, if I want to compare two different strings, here are two strings which both contain the word "nothing": "there is nothing in this room" and "nothing is the first thing in this room". Through this small exercise, and I am not claiming it is NLP-level work, an individual can build these sorts of things to understand the initial analytics of the data. I am not saying we should not think about NLP; that is a long-term goal. We are trying to understand a lot of things at this conference, but once we are back at our own desktops, the first thing we see is a spreadsheet, isn't it? That is where I am coming from. If you look here, there is a similarity of 21 percent between these two strings based on their words. That is calculated like a lift: how many words are common, as simple as sets A and B, intersection divided by union. These are some small things you can think of doing in a spreadsheet. There is also a methodology for doing web scraping in Excel, but I do not want to get into that now.

On data visualization, I have shown you a video, and there is one more important thing I want to share here: dynamic charts. What comes to your mind when you think of a dynamic chart, sir? Yes, the English meaning of dynamic: something continuously updating, agile. For example, with the same data set, if I change this value, the chart changes; that takes us from a basic Excel chart to a dynamic chart. But I want to show you something different: "earthquake is 1935.75, maximum in March, minimum in June". Some basic inferences like these can be stated for us automatically, because when we dig into analytics we tend to lose track of the basic stuff. Currently the exploratory data analysis behind charts is done by a human; when we put these basic rules into the program itself, the inference can be done by the machine. That is my thought.

The last and most important thing I want to share is a couple of things I have done around prediction. So far we were talking about earthquake data, which is all past data, with nothing to do with prediction. Market basket analysis, co-occurrence analysis, and the like, which you see as packages in Python or R, can be done here too. Let me quickly show you neural networks, and if time permits I will open the other files. More or less everybody knows about the MNIST data, right? We have a 28 by 28 pixel matrix for every digit, and we want to predict whether it is a 0, 1, 2, 3, 4, and so on. Just a quick insight into what a neural network is, though you may be aware of it already. I recollect my professor's statement that there are three different kinds of people in analytics: one who is strictly statistical in nature, another who is probabilistic in nature, and a third who is network-oriented. Any algorithm you can think of falls into one of these three categories, and the third category is where neural networks come in. That is the reason we can attack the MNIST dataset in all three categories: through linear or logistic regression, through probability with Bayesian classifiers, or with neural networks. Let me quickly explain how I tried to do this. I was inspired by someone who showed on YouTube how to create this; I took the technicalities but not the approach, because I was already aware of the approach. Here the dataset is hardly 90 or 95 images, a mixture of 0s, 9s, 1s, and so on. For processing, I converted each pixel to a number between 0 and 1 using a sigmoid function. There is only one intermediate layer, with 20 nodes, and those 20 nodes feed the final layer of 10 outputs, one per digit 0 to 9; the output probabilities tell me what kind of digit it is. Basically, 784 columns reduce to 20 neurons and then to 10. Now I will show you exactly how the machine learns. There are around 107 data points; I take a random data point, say number 39, which is a 5. I have put the backpropagation of the neural network into a VBA program, and whenever I run it, once, twice, three times, the prediction updates. If I press Ctrl+L and count: 1, 2, 3, 4, 5, 6; it is not working yet, but after some time it is able to detect the digit. Now if I give it a different number it will not take this much time; it will take less, as you can observe. Sorry, this one has taken more time, but the point is not the time; what I meant to show you is how learning happens: at the start it struggles, but with the progress of time it will definitely find what the digit is.
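The 784 → 20 → 10 setup with sigmoid activations and one-example-at-a-time backpropagation can be sketched outside the spreadsheet as well. Below is a minimal Python sketch of the same idea, not the VBA from the demo; the synthetic input, random initialization, and learning rate are my own assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class TinyNet:
    """784 inputs -> 20 hidden sigmoid nodes -> 10 sigmoid outputs."""
    def __init__(self, n_in=784, n_hidden=20, n_out=10):
        self.W1 = rng.normal(0, 0.1, (n_hidden, n_in))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(0, 0.1, (n_out, n_hidden))
        self.b2 = np.zeros(n_out)

    def forward(self, x):
        self.h = sigmoid(self.W1 @ x + self.b1)      # 20 hidden activations
        self.o = sigmoid(self.W2 @ self.h + self.b2)  # 10 output scores
        return self.o

    def train_step(self, x, y, lr=0.5):
        # One backpropagation pass for a single example (squared-error loss).
        o = self.forward(x)
        delta_o = (o - y) * o * (1 - o)               # output-layer error signal
        delta_h = (self.W2.T @ delta_o) * self.h * (1 - self.h)
        self.W2 -= lr * np.outer(delta_o, self.h)
        self.b2 -= lr * delta_o
        self.W1 -= lr * np.outer(delta_h, x)
        self.b1 -= lr * delta_h

# Toy demo: learn to recognise one synthetic "digit" whose label is 5.
x = rng.random(784)
y = np.zeros(10); y[5] = 1.0
net = TinyNet()
for _ in range(200):          # like pressing Ctrl+L again and again
    net.train_step(x, y)
print(int(np.argmax(net.forward(x))))
```

After enough passes the argmax of the outputs settles on the trained label, which is exactly the "it struggles at first, then finds the digit" behaviour of the demo.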
So, after putting this in a loop over multiple data points, I can stop the training and apply a testing data set. The main reason I have shown this to you is that we can perform basic machine learning algorithms in Excel, be it neural networks, K-means clustering, and so on. There is one more thing I want to share, the last one, co-occurrence analysis, which I feel is very important, and then I will summarize. Imagine these 10 cells are neurons. There is a kind of experiment where rats are given different kinds of stimuli and we observe how they react: which two neurons fire together? For example, when it is very hot, which two neurons of a human will trigger most, compared to all the hundreds and thousands of neurons? Here I have used random numbers to simulate that. If you look here, in the first interval these neurons are firing together, and in this interval these others are firing together. Using a concept called lift, which measures how often two neurons come together compared to how often they fire individually, I score the pairs. These are random numbers, so whenever I refresh, a new data set comes in, and you can see that these two neurons are at the top: they fire together the most. So I can focus on these two neurons and get into feature extraction: these two keep coming together, so why not take these two as my features? Now, can I take a minute to show market basket analysis? Again, there are positive and negative sides to every algorithm. Market basket analysis, as most of you know, is a quick summary of which two products are bought together. The biggest challenge here is context: as a person going to a market or an e-commerce site, I may buy a laptop and I may buy a chocolate.
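The pairwise lift scoring of neurons just described can be sketched as follows. This is a minimal Python sketch; the firing data is simulated with random numbers, as in the spreadsheet, and the firing probability is an assumption of mine:

```python
import itertools
import random

# Simulated firing data: for each time interval, the set of neurons
# (numbered 0-9) that fired, mirroring the random-number sheet.
random.seed(42)
intervals = [set(n for n in range(10) if random.random() < 0.3)
             for _ in range(200)]

def lift(a, b, intervals):
    """How often neurons a and b fire together, relative to chance."""
    n = len(intervals)
    p_a = sum(a in s for s in intervals) / n
    p_b = sum(b in s for s in intervals) / n
    p_ab = sum(a in s and b in s for s in intervals) / n
    return p_ab / (p_a * p_b) if p_a and p_b else 0.0

# Rank all 45 neuron pairs by lift; the top pair is the
# feature-extraction candidate mentioned in the talk.
pairs = sorted(((lift(a, b, intervals), (a, b))
                for a, b in itertools.combinations(range(10), 2)),
               reverse=True)
top_score, top_pair = pairs[0]
print(top_pair, round(top_score, 2))
```

Refreshing the sheet corresponds to re-seeding the random data here; the top-ranked pair changes, but the ranking logic stays the same.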
That does not mean the two are related; I have a context in mind, and ideally I would need to bring out that context. Here the intent is not to go to that level, but simply to validate which two products are bought together. Imagine this is an invoice, or a couple of bills, where somebody bought A, B, C, D, different products, with some overlap between bills. Let me run this and show you. I take these roughly 12 transactions as my data set, and, I almost forgot a very important thing: as the person analyzing the market basket data, I need to set a critical parameter called the threshold. The threshold is the hyperparameter of market basket analysis: if the score of a set of products is less than the threshold, I do not take it to the next level. How to choose it goes into a different topic, which I do not want to get into. If you look here, taking a single product at a time, F has been bought the most, and none of the products has been bought fewer than three times, so everything is good to go to the next level. Once the 1-item sets are done, go to the 2-item sets; when those are done, go to the 3-item sets. At the 2-item level, AB is 3, AD is 3, EF is 6, and FG is 6, while everything else is below the threshold, so only these can go into the next set. Then I compare those survivors, validating for example ABD or ADF, and in this way I gradually accumulate the item sets that contribute the most. Makes sense, right?

So let me summarize. This I have shown to you: exploratory data analysis definitely takes a lot of time, does it not? So there should be a mechanism, not necessarily to automate it, but at least to define the rules for it, because as data scientists, when we dig deep into the analytics part, we forget the basics. When we put rules on it, it will evaluate automatically.
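The frequent-itemset walk just described is essentially the Apriori idea: count the support of single items, keep only those at or above the threshold, extend the survivors to pairs, then triples, and so on. A minimal sketch, with illustrative bills (the item letters and counts here are made up, not the ones on the slide):

```python
from itertools import combinations

def apriori(transactions, threshold):
    """Return every itemset whose support (count) >= threshold."""
    items = sorted({i for t in transactions for i in t})
    frequent, current = {}, [frozenset([i]) for i in items]
    while current:
        # Count the support of each candidate itemset.
        counts = {c: sum(c <= t for t in transactions) for c in current}
        survivors = {c: n for c, n in counts.items() if n >= threshold}
        frequent.update(survivors)
        # Extend surviving k-itemsets to (k+1)-itemset candidates.
        keys = list(survivors)
        current = sorted({a | b for a, b in combinations(keys, 2)
                          if len(a | b) == len(a) + 1},
                         key=sorted)
    return frequent

# Hypothetical bills, one set of products per bill.
bills = [{"A", "B"}, {"A", "B", "C"}, {"A", "D"}, {"B", "C"},
         {"E", "F"}, {"E", "F"}, {"E", "F", "G"}, {"F", "G"}]
freq = apriori(bills, threshold=2)
print(freq[frozenset({"E", "F"})])   # E and F are bought together 3 times
```

Anything below the threshold, like the lone A-with-D bill here, is pruned and never extended, which is exactly the "do not take it to the next level" rule from the talk.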
If Excel can do all these things... I have not written out the list; I just want to hear it from you. One example I want to share: how many of you have used the IF function in Excel? Great. Tomorrow I want to use a SIMILARITY function in Excel. Tomorrow I want to type KMEANS, open the bracket, give it a range, and have a model come out; you understand where I am coming from, right? That is the vision I have for Excel, and the goal is analytics for the common businessman. It is not for you or me; it is for my dad, or somebody running a cab, an individual business owner. Last but not least: the tougher the problem, the more of the brain's processing we use; the more complicated the problems we solve, the more we grow. These are the topics I thought I would share, starting with learning the basic concepts and then extending to these things. That is what I wanted to share; there is a lot more which I will try to share in some other forum. I will share the couple of files I have. And the last important thing: any questions?

[On VBA] Some things require VBA script; some things do not. For example, I will show you one dashboard; exactly, that is what they are trying to do. Yes, let me put VBA like this: how about macros? When we think of automating an activity, what is the most important thing to consider about that activity? Repetition. A macro is for something that will be repeated. Will you repeat a dashboard? No. What will you repeat? The learning of the algorithm, because the more I repeat it, the more it learns.
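The imagined SIMILARITY and KMEANS worksheet functions are part of the talk's vision, not real Excel functions. As a sketch, here is what they could compute in Python: SIMILARITY as the word-overlap score mentioned earlier (intersection of word sets divided by their union), and KMEANS as a tiny one-dimensional clustering over a range of numbers; the data values are made up:

```python
import random

def similarity(a: str, b: str) -> float:
    """Word-level overlap score: |A intersect B| / |A union B|."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0

def kmeans(values, k=2, iters=20, seed=0):
    """1-D k-means over a range of numbers; returns the k centroids."""
    random.seed(seed)
    centroids = random.sample(list(values), k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sorted(centroids)

print(round(similarity("there is nothing in this room",
                       "nothing is the first thing in this room"), 2))
print(kmeans([1, 2, 3, 100, 101, 102]))
```

In the envisioned spreadsheet, the "range" argument of KMEANS would be a block of cells, and the returned centroids (here 2.0 and 101.0) would spill back into the sheet as the model.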
So, in that direction macros will be implemented, along with optimized formulas embedded in Excel, like SENTIMENT: take some text, open the bracket, and it should give the score. The extent I am thinking of is this: I select a word like "tougher" with my cursor, and it gives me the sentiment score of "tougher" in the context of this presentation. It can be done; I am not saying it cannot be. And I am not a programmer at all; I am a hardcore business consultant. I try to understand the business process, record the macro the first time, understand the code, and change the constants to variables in the places where there are static things. Whatever happens when I click a button, that is the VBA, and believe me, the one I just clicked to make the chart speak is just three lines. VBA is almost as good as downloading a package. But the point I am coming from, as Joy told me earlier, is that when we run algorithms using R or Python, we depend on packages, on code written by somebody else, and then we take business decisions based on what that person has written. So we can create our own algorithms. I am not saying it cannot be done in R and Python, but this is one way we can learn, and secondly, your decisions will be in your control, not in someone else's; otherwise, if tomorrow that person says he made a mistake, who will bear the loss? That does not usually happen, but I am just telling you. Any other questions?

Yes, that is right. For the neural network, the problem actually got solved in the morning, so I did not put a button there; I just pressed Ctrl+L, that is all. Yes, that part is VBA. But actually 90 percent of the neural network is not using VBA; I do not want to call it VBA at all, it is just mathematics in formulas. In the neural network the macro does only one thing: I have 20 data rows, and it sends them one at a time so that the network learns; beyond that, I repeat, in the neural network I have not used macros. Thank you so much.