Machine learning is about making predictions. This article introduces machine learning through a problem that most businesses face: forecasting customer churn. ML can predict which of your customers are at risk of leaving, which gives you the chance to take steps to prevent it.
Introduction
Machine learning is best understood by looking at it from several angles.
✓ Broad: machine learning is a process of prediction, usually based on the past.
✓ Practical: machine learning tries to find relationships in data that help predict what comes next.
✓ Technical: machine learning uses statistical methods to predict the value of a target variable from a set of input data.
✓ Math: machine learning tries to predict the value of a variable Y given a set of inputs X.
Machine learning lets you make accurate predictions by combining simple statistical methods, algorithms, and modern computing power.
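To make the "predict Y given X" framing concrete, here is a minimal sketch in Python that fits a straight line to a handful of points and uses it to predict a new value. The numbers and the function names `fit_line` and `predict` are made up for illustration; real data would be noisier and would usually call for a library rather than hand-written code.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predict Y for a new input X using the fitted line."""
    return slope * x + intercept


# Hypothetical toy data in which Y is exactly twice X.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]

slope, intercept = fit_line(xs, ys)
prediction = predict(6, slope, intercept)
```

The "learning" here is nothing more than estimating two numbers from past data; the "prediction" is applying them to an input the model has not seen.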
Churn example: machine learning helps us understand why customers leave and when. Our input values may include:
✓ How often do users interact with the product?
✓ What holds them back?
✓ How are users connected to each other?
We assume that this data will say something important about
the client.
Data / Algorithm
Any type of machine learning can be divided into two main parts:
✓ Data
✓ Algorithm
Any other complications you may hear about (deep learning, gradient descent, reinforcement learning) are just variations on these two fundamental parts. If you ever get lost in all of these terms, just ask yourself whether the term has to do with your data or with your algorithm. Everything else is noise.
Your Data
The data part of machine learning is the simplest. Your data is both what you are trying to predict and what you will use to teach the computer to predict it.
Churn example: the data can be past user activity. Data is usually organized as rows, one per user, with columns representing each user's features (diagram above).
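As a rough illustration of that layout, the sketch below stores each user as one row and each feature as one field, then separates the inputs from the value to predict. The `UserRow` class, its field names, and the values are hypothetical and not taken from a real dataset.

```python
from dataclasses import dataclass


@dataclass
class UserRow:
    """One row of the table: a single user and that user's features."""
    user_id: int
    orders: int                  # feature: total number of past orders
    days_between_orders: float   # feature: average gap between orders
    churned: bool                # target: did the user leave?


# Made-up rows for illustration only.
rows = [
    UserRow(1, 12, 7.0, False),
    UserRow(2, 8, 10.0, False),
    UserRow(3, 2, 45.0, True),
    UserRow(4, 1, 60.0, True),
]

# Inputs (X): one list of feature values per user.
X = [[r.orders, r.days_between_orders] for r in rows]
# Target (Y): what we want to predict, encoded as 0 or 1.
y = [int(r.churned) for r in rows]
```

Most machine learning tools expect data in roughly this shape: a table of inputs and a matching column of outcomes.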
Machine learning methods will try to find patterns in this data. We could find several different types of relationships:
✓ Users with a large interval between orders are likely to leave soon.
✓ Users with a large number of orders are unlikely to leave.
Any number of relationships can exist in the data. The machine only needs the data: it will process it and surface the relationships.
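To show what spotting such a relationship can look like, the sketch below compares the churn rate of users with long gaps between orders against users with short gaps. The data, the 21-day cutoff, and the helper name `churn_rate` are assumptions chosen for illustration.

```python
def churn_rate(users):
    """Fraction of users in the group who churned."""
    return sum(churned for _, churned in users) / len(users)


# Hypothetical (days_between_orders, churned) pairs; 1 means the user left.
users = [(5, 0), (7, 0), (10, 0), (8, 1), (30, 1), (45, 1), (60, 1), (40, 0)]

threshold = 21  # arbitrary cutoff between "short" and "long" gaps, in days
long_gap = [u for u in users if u[0] > threshold]
short_gap = [u for u in users if u[0] <= threshold]

long_gap_rate = churn_rate(long_gap)
short_gap_rate = churn_rate(short_gap)
```

In this made-up sample, three of the four long-gap users churned, against one of the four short-gap users. A machine learning method automates this kind of comparison across many features and many possible cutoffs.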
Your Algorithm
In machine learning, an algorithm is just the method you use to search for relationships in your data. Algorithms can be complex or simple, large or small, but in the end they are all just ways of figuring out what drives the changes you are trying to predict. Have you heard of deep learning? That is (mostly) a type of algorithm. Just as the Merge Sort algorithm is effective at sorting arrays, machine learning algorithms are effective at identifying relationships and associations.
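One of the smallest possible examples of such a search is a single-threshold rule, sometimes called a decision stump: try every candidate cutoff on one feature and keep the one that classifies the past data best. The sketch below is a hand-written illustration under that assumption; the function name `best_threshold` and the data are hypothetical.

```python
def best_threshold(values, labels):
    """Find the cutoff t for which the rule 'value > t means churn' is most accurate."""
    best_t, best_acc = None, 0.0
    for t in sorted(set(values)):
        # Predict churn (1) when the value exceeds the candidate cutoff.
        preds = [1 if v > t else 0 for v in values]
        acc = sum(p == l for p, l in zip(preds, labels)) / len(labels)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t, best_acc


# Made-up data: days between orders, and whether the user churned (1) or not (0).
gaps = [5, 7, 8, 10, 30, 40, 45, 60]
churned = [0, 0, 0, 0, 1, 1, 1, 1]

cutoff, accuracy = best_threshold(gaps, churned)
```

On this deliberately clean sample the search settles on a cutoff of 10 days. More sophisticated algorithms differ mainly in how many features they consider at once and how flexible the rules they search over are.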
Different types of algorithms will help you achieve different goals. If you want to explain the relationships you find to other people, it might be worth using a simple algorithm such as linear regression. If you are most interested in accuracy (and explainability is not too important), neural networks can often achieve higher accuracy. This is often called the trade-off between accuracy and explainability, and it is a serious choice that many data experts have to make.
Everything else that you encounter in the ML world relates to one of these two: your data or your algorithm. Feature normalization? A change to your data. Deep learning? A type of algorithm. Cross-validation? A way to evaluate and improve your algorithm.
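As a closing illustration of the data side and the algorithm side, the sketch below rescales one feature to the 0 to 1 range (a change to the data) and splits row indices into folds for cross-validation (a way to check an algorithm on rows it was not trained on). Both helpers, `min_max_scale` and `k_fold_indices`, are simplified hand-written versions for illustration, not a library API.

```python
def min_max_scale(values):
    """Rescale values linearly so the smallest becomes 0.0 and the largest 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical; avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def k_fold_indices(n, k):
    """Yield (train_indices, test_indices) for k contiguous folds over n rows."""
    # Spread any remainder over the first n % k folds.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


scaled = min_max_scale([10, 20, 30])
folds = list(k_fold_indices(6, 3))
```

Each fold takes a turn as the held-out test set while the remaining rows are used for training, so every row is used for testing exactly once.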