Types of Machine Learning Systems

There are so many different types of Machine Learning systems that it is useful to classify them in broad categories based on:

• Whether or not they are trained with human supervision (supervised, unsupervised, semisupervised, and Reinforcement Learning)

• Whether or not they can learn incrementally on the fly (online versus batch learning)

• Whether they work by simply comparing new data points to known data points, or instead detect patterns in the training data and build a predictive model, much like scientists do (instance-based versus model-based learning)

These criteria are not exclusive; you can combine them in any way you like. For example, a state-of-the-art spam filter may learn on the fly using a deep neural network model trained on examples of spam and ham. This makes it an online, model-based, supervised learning system.

Let’s look at each of these criteria a bit more closely.

Supervised/Unsupervised Learning

Machine Learning systems can be classified according to the amount and type of supervision they get during training. There are four major categories: supervised learning, unsupervised learning, semisupervised learning, and Reinforcement Learning.

Supervised learning

In supervised learning, the training data you feed to the algorithm includes the desired solutions, called labels.

A typical supervised learning task is classification. The spam filter is a good example of this: it is trained with many example emails along with their class (spam or ham), and it must learn how to classify new emails.
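To make the idea of learning from labeled examples concrete, here is a minimal sketch of a classifier in plain Python. It uses a k-Nearest Neighbors vote, one of the algorithms listed later in this section. The two features (number of links and number of exclamation marks per email), the tiny training set, and the function name `classify` are all made up for illustration; a real spam filter would use far richer features and far more data.

```python
import math
from collections import Counter

# Hypothetical labeled training set:
# (number of links, number of exclamation marks) -> class label
training_emails = [
    ((8, 6), "spam"),
    ((7, 9), "spam"),
    ((9, 4), "spam"),
    ((0, 1), "ham"),
    ((1, 0), "ham"),
    ((2, 1), "ham"),
]

def classify(features, training_set, k=3):
    """Label `features` by majority vote among its k nearest training examples."""
    by_distance = sorted(training_set, key=lambda example: math.dist(example[0], features))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

prediction_spammy = classify((6, 7), training_emails)  # many links and exclamation marks
prediction_plain = classify((1, 1), training_emails)   # few of either
print(prediction_spammy, prediction_plain)
```

The labels in `training_emails` play the role of the "desired solutions": the classifier never sees a rule for what spam is, only examples of it.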

Another typical task is to predict a target numeric value, such as the price of a car, given a set of features (mileage, age, brand, etc.) called predictors. This sort of task is called regression (Figure 1-6). To train the system, you need to give it many examples of cars, including both their predictors and their labels (i.e., their prices).
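As a rough illustration of regression, the sketch below fits a straight line to a handful of cars using ordinary least squares on a single predictor. The mileage and price figures are invented (and deliberately fall on an exact line to keep the arithmetic easy to follow), and `fit_line` is a hypothetical helper name, not a library function.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for a single predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up training examples: predictor (mileage, in thousands of km) and label (price)
mileages = [10, 50, 100, 150]
prices = [19000, 15000, 10000, 5000]

slope, intercept = fit_line(mileages, prices)

# Predict the price of a car the system has never seen (80,000 km)
predicted_price = slope * 80 + intercept
print(slope, intercept, predicted_price)
```

With more predictors (age, brand, and so on) the same idea generalizes to multiple Linear Regression, which is better handled by a library than by hand.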

Note that some regression algorithms can be used for classification as well, and vice versa. For example, Logistic Regression is commonly used for classification, as it can output a value that corresponds to the probability of belonging to a given class (e.g., 20% chance of being spam).
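The following sketch shows how a Logistic Regression model can turn a numeric score into a class probability. It trains a one-feature model with plain batch gradient descent; the feature (a count of suspicious words), the data, the learning rate, and the number of epochs are all assumptions chosen for illustration, not a recommended setup.

```python
import math

def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic_regression(xs, ys, learning_rate=0.05, epochs=5000):
    """Fit weight and bias by batch gradient descent on the log loss."""
    weight, bias = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(weight * x + bias) - y for x, y in zip(xs, ys)]
        weight -= learning_rate * sum(e * x for e, x in zip(errors, xs)) / n
        bias -= learning_rate * sum(errors) / n
    return weight, bias

# Hypothetical data: suspicious-word count per email, and label (1 = spam, 0 = ham)
suspicious_word_counts = [0, 1, 2, 3, 6, 7, 8, 9]
is_spam = [0, 0, 0, 0, 1, 1, 1, 1]

weight, bias = train_logistic_regression(suspicious_word_counts, is_spam)

def spam_probability(count):
    """Estimated probability that an email with `count` suspicious words is spam."""
    return sigmoid(weight * count + bias)

p_low = spam_probability(1)
p_high = spam_probability(8)
print(round(p_low, 3), round(p_high, 3))
```

To use the model as a classifier, you would compare the probability with a threshold (0.5 is a common default) and predict spam when it is exceeded.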

Here are some of the most important supervised learning algorithms (covered in this book):

• k-Nearest Neighbors

• Linear Regression

• Logistic Regression

• Support Vector Machines (SVMs)

• Decision Trees and Random Forests

• Neural networks

Unsupervised learning

In unsupervised learning, as you might guess, the training data is unlabeled (Figure 1-7). The system tries to learn without a teacher.

Here are some of the most important unsupervised learning algorithms (we will cover dimensionality reduction in Chapter 8):

• Clustering

— k-Means

— Hierarchical Cluster Analysis (HCA)

— Expectation Maximization

• Visualization and dimensionality reduction

— Principal Component Analysis (PCA)

— Kernel PCA

— Locally-Linear Embedding (LLE)

— t-distributed Stochastic Neighbor Embedding (t-SNE)

• Association rule learning

— Apriori

— Eclat

For example, say you have a lot of data about your blog’s visitors. You may want to run a clustering algorithm to try to detect groups of similar visitors (Figure 1-8). At no point do you tell the algorithm which group a visitor belongs to: it finds those connections without your help. For instance, it might notice that 40% of your visitors are males who love comic books and generally read your blog in the evening, while 20% are young sci-fi lovers who visit during the weekends, and so on. If you use a hierarchical clustering algorithm, it may also subdivide each group into smaller groups. This may help you target your posts for each group.
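A bare-bones version of k-Means, sketched below in plain Python, shows how groups can emerge without any labels. The visitor features (age and typical hour of visit) and the data are invented. As a simplification, the centroids are seeded with the first k points so the run is repeatable; practical implementations pick the starting centroids more carefully.

```python
import math

def k_means(points, k, iterations=10):
    """Group `points` into k clusters; returns (centroids, cluster index per point)."""
    # Simplification: seed with the first k points instead of a random choice.
    centroids = [points[i] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(range(k), key=lambda i: math.dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [
            tuple(sum(coords) / len(cluster) for coords in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    assignments = [
        min(range(k), key=lambda i: math.dist(point, centroids[i])) for point in points
    ]
    return centroids, assignments

# Hypothetical, unlabeled visitor data: (age, typical hour of visit)
visitors = [(16, 10), (35, 21), (18, 11), (17, 9), (38, 22), (33, 20)]

centroids, assignments = k_means(visitors, k=2)
print(assignments)
```

Note that the algorithm only returns cluster indices. Interpreting a cluster as, say, "younger daytime readers" is left to you.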

Visualization algorithms are also good examples of unsupervised learning algorithms: you feed them a lot of complex and unlabeled data, and they output a 2D or 3D representation of your data that can easily be plotted (Figure 1-9). These algorithms try to preserve as much structure as they can (e.g., trying to keep separate clusters in the input space from overlapping in the visualization), so you can understand how the data is organized and perhaps identify unsuspected patterns.
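As one possible illustration, the sketch below reduces 3D points to 2D with Principal Component Analysis, computed here by power iteration on the covariance matrix. This is a deliberately simple, assumption-laden version: the data is made up, the helper names are hypothetical, and a numerical library would normally do this work. It is PCA only, not t-SNE or LLE.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def center(rows):
    """Subtract the per-column mean so the data is centered on the origin."""
    means = [sum(column) / len(rows) for column in zip(*rows)]
    return [[value - mean for value, mean in zip(row, means)] for row in rows]

def covariance_matrix(centered_rows):
    n = len(centered_rows)
    columns = list(zip(*centered_rows))
    return [[dot(ci, cj) / (n - 1) for cj in columns] for ci in columns]

def dominant_eigenvector(matrix, steps=200):
    """Power iteration: repeatedly multiply and renormalize a starting vector."""
    size = len(matrix)
    vector = [1.0 / math.sqrt(size)] * size
    for _ in range(steps):
        product = [dot(row, vector) for row in matrix]
        norm = math.sqrt(dot(product, product))
        vector = [value / norm for value in product]
    return vector

def project_to_2d(rows):
    """Project rows onto the two directions of largest variance."""
    centered = center(rows)
    cov = covariance_matrix(centered)
    first = dominant_eigenvector(cov)
    eigenvalue = dot(first, [dot(row, first) for row in cov])
    # Deflation: remove the first component, then find the next strongest one.
    size = len(cov)
    deflated = [
        [cov[i][j] - eigenvalue * first[i] * first[j] for j in range(size)]
        for i in range(size)
    ]
    second = dominant_eigenvector(deflated)
    return [(dot(row, first), dot(row, second)) for row in centered]

# Made-up unlabeled 3D data containing two well-separated groups
points = [
    (1.0, 1.2, 0.8), (1.5, 0.9, 1.1), (0.8, 1.4, 1.3),
    (9.0, 8.7, 8.1), (9.4, 9.2, 7.8), (8.8, 9.1, 8.4),
]

embedding = project_to_2d(points)
for x, y in embedding:
    print(round(x, 2), round(y, 2))
```

In the 2D output the two groups land on opposite sides of the first axis, which is the kind of structure preservation described above.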
