
Types of Machine Learning Systems

 There are so many different types of Machine Learning systems that it is useful to

classify them in broad categories based on:

• Whether or not they are trained with human supervision (supervised, unsupervised,

semisupervised, and Reinforcement Learning)

• Whether or not they can learn incrementally on the fly (online versus batch

learning)

• Whether they work by simply comparing new data points to known data points,

or instead detect patterns in the training data and build a predictive model, much

like scientists do (instance-based versus model-based learning)

These criteria are not exclusive; you can combine them in any way you like. For

example, a state-of-the-art spam filter may learn on the fly using a deep neural network model trained on examples of spam and ham; this makes it an online, model-based,

supervised learning system.

Let’s look at each of these criteria a bit more closely.

Supervised/Unsupervised Learning

Machine Learning systems can be classified according to the amount and type of

supervision they get during training. There are four major categories: supervised

learning, unsupervised learning, semisupervised learning, and Reinforcement Learning.

Supervised learning

In supervised learning, the training data you feed to the algorithm includes the desired

solutions, called labels.


A typical supervised learning task is classification. The spam filter is a good example

of this: it is trained with many example emails along with their class (spam or ham),

and it must learn how to classify new emails.
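
To make the idea of learning from labeled examples concrete, here is a minimal sketch of a classifier. It is not a realistic spam filter: the four training emails are made up, and the similarity measure (`word_overlap`, a Jaccard score on word sets) and the 1-nearest-neighbor rule in `classify` are assumptions chosen only to keep the example short.

```python
# Toy labeled training set: each email comes with its class (the label).
# The emails below are hypothetical, for illustration only.
TRAINING_EMAILS = [
    ("win free money now", "spam"),
    ("free prize claim your money", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday with the team", "ham"),
]


def word_overlap(a, b):
    """Jaccard similarity between the word sets of two texts."""
    words_a, words_b = set(a.split()), set(b.split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def classify(email):
    """Return the label of the most similar training email (1-nearest neighbor)."""
    _, best_label = max(
        TRAINING_EMAILS, key=lambda pair: word_overlap(email, pair[0])
    )
    return best_label


print(classify("claim your free money"))        # shares most words with a spam example
print(classify("agenda for the team meeting"))  # shares most words with a ham example
```

The labels are what make this supervised: the function never decides what "spam" means on its own, it only reuses the classes attached to the training emails.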

Another typical task is to predict a target numeric value, such as the price of a car,

given a set of features (mileage, age, brand, etc.) called predictors. This sort of task is

called regression (Figure 1-6). To train the system, you need to give it many examples

of cars, including both their predictors and their labels (i.e., their prices).
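
As a rough illustration of regression, the sketch below fits a straight line to a handful of cars using ordinary least squares. It uses a single predictor (mileage) instead of the full feature set mentioned above, and the data in `CARS` is invented so that the result is easy to check by hand; `fit_line` and `predict_price` are hypothetical helper names.

```python
# Hypothetical training examples: (mileage in thousands of km, price in dollars).
# The numbers are made up and lie exactly on a line to keep the sketch checkable.
CARS = [(20, 18000), (50, 15000), (80, 12000), (110, 9000)]


def fit_line(examples):
    """Least-squares fit of price = intercept + slope * mileage."""
    n = len(examples)
    mean_x = sum(x for x, _ in examples) / n
    mean_y = sum(y for _, y in examples) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in examples)
    variance = sum((x - mean_x) ** 2 for x, _ in examples)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict_price(mileage, intercept, slope):
    """Predict the target numeric value (the price) for a new car."""
    return intercept + slope * mileage


intercept, slope = fit_line(CARS)
print(round(predict_price(65, intercept, slope)))
```

Here the mileage is the predictor and the price is the label; a real system would use several predictors and far more examples.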


Note that some regression algorithms can be used for classification as well, and vice

versa. For example, Logistic Regression is commonly used for classification, as it can

output a value that corresponds to the probability of belonging to a given class (e.g.,

20% chance of being spam).
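
The sketch below shows why Logistic Regression can serve as a classifier: it computes a weighted sum of the features, squashes it into a probability with the sigmoid function, and a threshold turns that probability into a class. The weights, the bias, the two features, and the 0.5 threshold are hand-picked assumptions; a real model would learn its parameters from labeled data.

```python
import math


def sigmoid(score):
    """Map any real-valued score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-score))


def spam_probability(features, weights, bias):
    """Logistic Regression prediction: sigmoid of a weighted sum of features."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(score)


# Hypothetical, hand-picked parameters (not learned from any data).
# Feature 0: count of the word "free"; feature 1: count of the recipient's name.
WEIGHTS = [1.5, -2.0]
BIAS = -1.0

probability = spam_probability([1, 1], WEIGHTS, BIAS)
label = "spam" if probability >= 0.5 else "ham"
print(f"{probability:.0%} chance of being spam -> {label}")
```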

Here are some of the most important supervised learning algorithms (covered in this

book):

• k-Nearest Neighbors

• Linear Regression

• Logistic Regression

• Support Vector Machines (SVMs)

• Decision Trees and Random Forests

• Neural networks

Unsupervised learning

In unsupervised learning, as you might guess, the training data is unlabeled

(Figure 1-7). The system tries to learn without a teacher.


Here are some of the most important unsupervised learning algorithms (we will

cover dimensionality reduction in Chapter 8):

• Clustering

— k-Means

— Hierarchical Cluster Analysis (HCA)

— Expectation Maximization

• Visualization and dimensionality reduction

— Principal Component Analysis (PCA)

— Kernel PCA

— Locally-Linear Embedding (LLE)

— t-distributed Stochastic Neighbor Embedding (t-SNE)

• Association rule learning

— Apriori

— Eclat

For example, say you have a lot of data about your blog’s visitors. You may want to

run a clustering algorithm to try to detect groups of similar visitors (Figure 1-8). At

no point do you tell the algorithm which group a visitor belongs to: it finds those

connections without your help. For example, it might notice that 40% of your visitors

are males who love comic books and generally read your blog in the evening, while

20% are young sci-fi lovers who visit during the weekends, and so on. If you use a

hierarchical clustering algorithm, it may also subdivide each group into smaller

groups. This may help you target your posts for each group. 
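
A minimal sketch of clustering, using plain k-Means (not the hierarchical variant mentioned above). The visitor features (age and usual visiting hour), the data, and the choice of starting centroids are all assumptions made for illustration. Note that no label is ever passed to the algorithm.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(coords) / len(points) for coords in zip(*points))


def k_means(points, centroids, iterations=10):
    """Plain k-Means: alternate assigning points and recomputing centroids."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: squared_distance(point, centroids[i]),
            )
            clusters[nearest].append(point)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            mean_point(cluster) if cluster else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical visitor features: (age, usual visiting hour). No labels are given.
VISITORS = [(18, 21), (20, 22), (22, 20), (41, 9), (45, 8), (43, 10)]

# Starting centroids are taken from the data here; real code usually picks them randomly.
centroids, clusters = k_means(VISITORS, centroids=[VISITORS[0], VISITORS[3]])
for centroid, cluster in zip(centroids, clusters):
    print(centroid, cluster)
```

On this toy data the algorithm separates young evening visitors from older morning visitors, but interpreting what each group means is still left to you.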


Visualization algorithms are also good examples of unsupervised learning algorithms:

you feed them a lot of complex and unlabeled data, and they output a 2D or 3D representation

of your data that can easily be plotted (Figure 1-9). These algorithms try

to preserve as much structure as they can (e.g., trying to keep separate clusters in the

input space from overlapping in the visualization), so you can understand how the

data is organized and perhaps identify unsuspected patterns.
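
As one possible illustration, the sketch below projects 3D points onto 2D with Principal Component Analysis. It finds the two leading directions of variance by power iteration with deflation, a simple stand-in for the eigendecomposition or SVD routine a library would normally provide. The data and all function names are hypothetical, and the points were chosen to lie on a plane so that the 2D projection loses nothing.

```python
def center(rows):
    """Subtract the per-column mean from every row."""
    means = [sum(col) / len(rows) for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def covariance_matrix(rows):
    """Sample covariance matrix of already-centered rows."""
    n, dims = len(rows), len(rows[0])
    return [
        [sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(dims)]
        for i in range(dims)
    ]


def top_component(matrix, steps=500):
    """Approximate the leading eigenvector by power iteration."""
    vector = [1.0] * len(matrix)
    for _ in range(steps):
        product = [dot(row, vector) for row in matrix]
        norm = dot(product, product) ** 0.5
        if norm == 0:
            break
        vector = [x / norm for x in product]
    return vector


def pca_2d(rows):
    """Project rows onto their two leading principal components."""
    centered = center(rows)
    cov = covariance_matrix(centered)
    first = top_component(cov)
    eigenvalue = dot(first, [dot(row, first) for row in cov])
    # Deflation: remove the first component's variance, then search again.
    size = len(cov)
    deflated = [
        [cov[i][j] - eigenvalue * first[i] * first[j] for j in range(size)]
        for i in range(size)
    ]
    second = top_component(deflated)
    return [(dot(row, first), dot(row, second)) for row in centered]


# Hypothetical unlabeled 3D points that all lie on the plane x = y.
POINTS = [(-2, -2, -1), (-2, -2, 1), (2, 2, -1), (2, 2, 1), (0, 0, 0)]

projected = pca_2d(POINTS)
for point, coords in zip(POINTS, projected):
    print(point, "->", tuple(round(c, 3) for c in coords))
```

The resulting 2D coordinates can be passed to any plotting tool. PCA is a linear method, so unlike t-SNE or LLE it will not untangle curved structure in the input space.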
