At a recent Google Cloud conference, Rob Craft, product lead for Cloud Machine Learning, got up on stage and told the crowd: "Upwards of nine years ago, we got out of the rules business. Everyone in this room probably writes rules for a living, if you write code. If this, then that: those are rules. If the following conditions are met, the following things should execute. The stored procedure sees this, the stored procedure writes that. Those are all rule-based systems. What if you were able to declare, through a statistical model, here is what good looks like, and the confidence that good is this thing, and the system then determines on its own how it should get to that good thing? That is what a predictive type of system tries to do." What he is describing is the shift that has taken place over the past decade toward machine learning becoming an ever more popular method for building software systems.

Machine learning has seen explosive growth over the past decade and has found applications in many different areas. For example, the machine learning algorithms on Yelp's website help the company's staff compile, categorize and label images more effectively. Machine learning is used at Facebook to filter out spam and poor-quality content, and the company is also researching computer vision algorithms that can describe images to visually impaired people. Baidu's R&D lab uses machine learning to build what the company calls Deep Voice, a deep neural network that can generate entirely synthetic human voices that are very difficult to distinguish from genuine human speech.

Machine learning refers to the process through which a computer constructs an algorithm based upon the analysis of data. Such algorithms move beyond strictly static program instructions by making data-driven predictions or decisions, building a model from sample inputs. Machine learning is employed in computing tasks where designing and programming explicit algorithms with good performance is difficult or infeasible. In such cases we tell the computer what we want the output to be, and it builds a model based upon the data that will be able to reproduce those results when presented with new data to process.

Machine learning can largely be characterized as an optimization process over some set of data. To solve a machine learning problem, we want to find a metric that tells us how far we are from the solution and then try to minimize that value; this measure of the error is called the loss function.

The formal definition of a machine learning system, due to Tom Mitchell, is stated as follows: a computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E. This essentially means that the computer is given a task and some metric for success, and with each iteration over the task it gets better at performing it, as measured by that metric.
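To make this concrete, here is a minimal sketch of that optimization loop: define a loss function that measures the error, then repeatedly nudge the model's parameters in whichever direction reduces it. Python with NumPy, the toy data and the learning rate are all illustrative assumptions rather than anything specified in the text.

```python
import numpy as np

# Toy labeled data: inputs x and desired outputs y (roughly y = 2x + 1).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 7.1, 8.8])

# Model: a line y_hat = w*x + b, whose parameters w and b are to be learned.
w, b = 0.0, 0.0
learning_rate = 0.01  # illustrative step size

def loss(w, b):
    """Mean squared error: how far the model's outputs are from the targets."""
    return np.mean((w * x + b - y) ** 2)

for step in range(1000):
    error = w * x + b - y
    # Gradient of the loss with respect to each parameter.
    grad_w = 2 * np.mean(error * x)
    grad_b = 2 * np.mean(error)
    # Gradient descent: step each parameter downhill to reduce the loss.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(f"w = {w:.2f}, b = {b:.2f}, loss = {loss(w, b):.4f}")  # near w=2, b=1
```

Here the experience E is each pass over the data, the task T is predicting y from x, and the performance measure P is the loss, which improves with every iteration. The same loop, scaled up enormously, is what drives real systems.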
For example, Google used a machine learning algorithm to drastically reduce the electricity consumption in its data centers. Using a system of neural networks trained on the different operating scenarios and parameters within its data centers, the company created an efficient and adaptive framework for understanding data center dynamics and optimizing their efficiency. They accomplished this by taking historical data that had already been collected by thousands of sensors within the data center and using it to train a set of deep neural networks. The machine learning system analyzed the internal arrangement of the data center and tried different configurations to assess the efficiency of energy consumption, iterating continually: adjusting the configuration, trying to reduce the error value, and learning at each iteration. Ultimately the algorithm managed to reduce the amount of energy used for cooling by up to 40%. This is the key to most machine learning problems: you take the problem, define the error, and minimize it using gradient descent, as in the sketch above, trying different options to see which reduces the error the most and then iterating on that process.

Machine learning systems are typically categorized as supervised or unsupervised. The biggest difference is that supervised learning deals with labeled data while unsupervised learning deals with unlabeled data. Labeled data is a group of samples that have been tagged with one or more labels. The labeling process takes a set of unlabeled data and attempts to apply meaningful tags that are informative of its content. For example, labels might indicate whether a photo contains a mountain or a lake, what type of action is being performed in a video, what the topic of a news article is, or what the overall sentiment of a tweet is. Labeling can be a time-consuming exercise that is often done by humans. After obtaining a labeled data set, machine learning models can be applied to it so that new, unlabeled data can be presented to the model and a likely label predicted for it automatically by the algorithm.

Techniques that can work with unlabeled data are called unsupervised learning. With unsupervised learning, we are trying to get the machine to find and create categories within the data on its own. In an unsupervised approach, you are trying to build a prediction without having the outcome available as a reference for training the algorithm; instead, we let the model work on its own to discover structure that may not be visible to the human eye. Clustering algorithms are one such example, where a set of inputs is to be divided into groups; this involves analyzing patterns in unlabeled data to find groups of similar items (a short clustering sketch appears after the classification example below). Unsupervised learning is important because most of the data we get in the real world does not come with little tags attached telling us what it is, so we need to perform some kind of analysis of this sort before going any further.

With supervised learning, the computer is presented with example inputs and their desired outputs, given by a teacher, and the goal is to learn a general rule that maps inputs to outputs. We do this by training the model; that is, we load the model with knowledge so that we can have it predict future instances. For example, we teach a model by training it on data from a labeled data set and then provide it with new data to see whether it can match the original labels. The main types of supervised learning are classification and regression. Spam filtering is an example of classification, where the inputs are email messages and the classes are "spam" and "not spam"; likewise, one might feed the system a data set of flowers and have it identify the different species. Regression, by contrast, predicts a continuous value rather than a class, as the line-fitting sketch above does.
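As a concrete illustration of the supervised case, the sketch below trains a classifier on labeled flower measurements and then asks it to label flowers it has never seen. The use of scikit-learn, its bundled iris data set and logistic regression are assumptions made for illustration; the text does not prescribe any of them.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Labeled data: flower measurements (inputs) tagged with a species (label).
X, y = load_iris(return_X_y=True)

# Hold out some examples the model never sees during training,
# so we can check how well the learned rule generalizes.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)          # learn a rule mapping inputs to labels

predictions = model.predict(X_test)  # predict labels for unseen flowers
print(f"accuracy on unseen data: {accuracy_score(y_test, predictions):.2f}")
```

A spam filter works the same way in outline: the inputs become word counts from email messages and the labels become "spam" or "not spam".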
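For the unsupervised case described earlier, here is a matching clustering sketch. No labels are supplied; the algorithm divides the points into groups purely by similarity. The synthetic two-blob data and the choice of k-means from scikit-learn are again illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

# Unlabeled data: two blobs of points with no tags telling us what they are.
rng = np.random.default_rng(0)
points = np.vstack([
    rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2)),
    rng.normal(loc=[5.0, 5.0], scale=0.5, size=(50, 2)),
])

# Ask k-means to divide the inputs into two groups based on similarity alone.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
print(kmeans.labels_[:5], kmeans.labels_[-5:])  # discovered group assignments
```

The discovered clusters carry no names; deciding what each group means is still up to us, which is exactly the sense in which this kind of learning is unsupervised.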
A core objective of a machine learning system is to generalize from its experience. Generalization in this context is the ability of a learning machine to perform accurately on new, unseen examples or tasks after having experienced a learning data set; the held-out test set in the classification sketch above measures exactly this. The training examples come from some generally unknown probability distribution, and the system has to build a general model of that space which enables it to produce sufficiently accurate predictions in new cases. The key aspect of machine learning that makes it such an important method with respect to big data is that we do not have to hard-code pre-specified rules. The iterative feedback aspect of machine learning also matters: as models are exposed to new data, they are able to independently adapt and evolve, learning from previous computations to produce reliable, repeatable decisions and results. There are many different approaches to building machine learning systems, and in the next module we will give an overview of some of the primary ones.