Random forest

Decision Tree

In this post we are going to tackle a classification problem using CART (Classification And Regression Trees) models.

We will use the Bank Marketing Data Set, provided by the UCI Machine Learning Repository:
ref. [Moro et al., 2014] S. Moro, P. Cortez and P. Rita. A Data-Driven Approach to Predict the Success of Bank Telemarketing. Decision Support Systems, Elsevier, 62:22-31, June 2014.

The data set contains the results of direct marketing campaigns carried out by a Portuguese bank, which used outbound contact center calls to try to sell term deposit products to customers.
The labeled output we are interested in predicting is binary (column y): “yes” if the customer accepted the bank deposit offer, or “no” if the offer was rejected.
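Most models expect a numeric target, so the “yes”/“no” labels are usually mapped to 1 and 0 first. The snippet below is a minimal sketch of that step: the rows are toy values made up for illustration, and the `target` column name is an assumption; only the column name `y` and its two values come from the data set description.

```python
import pandas as pd

# Toy rows standing in for the real data (hypothetical values).
df = pd.DataFrame({"y": ["no", "yes", "no", "no", "yes"]})

# Map the text labels to integers: 1 = offer accepted, 0 = offer rejected.
df["target"] = df["y"].map({"yes": 1, "no": 0})

# Share of accepted offers in the toy sample.
acceptance_rate = df["target"].mean()
print(df)
print("acceptance rate:", acceptance_rate)
```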

Let’s import some useful libraries, including scikit-learn:
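The import list of the original post is not shown in this excerpt, so what follows is only a plausible sketch and not the author's code. It assumes scikit-learn's `DecisionTreeClassifier` and `RandomForestClassifier`, and it uses synthetic data from `make_classification` as a stand-in for the bank data; the sample sizes and hyperparameters are arbitrary choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data (a stand-in, not the bank data set).
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=0
)

# Hold out 25% of the rows for evaluation, keeping the class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# A single CART tree and a forest built from many such trees.
tree = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_train, y_train)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

# Mean accuracy on the held-out rows.
tree_accuracy = tree.score(X_test, y_test)
forest_accuracy = forest.score(X_test, y_test)
print("tree accuracy:", tree_accuracy)
print("forest accuracy:", forest_accuracy)
```

With the real data, the categorical columns would need to be encoded as numbers before fitting, since these estimators expect numeric features.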

Continue reading “Random forest”

Linear Regression

In this first post we will play a bit with Linear Regression in order to get familiar with some key concepts of machine learning.

Deep learning models such as Convolutional Neural Networks and Recurrent Neural Networks, as well as Support Vector Machines and Logistic Regression, are great techniques for complex predictions, including non-linear ones.

However, Linear Regression is a great place to start when you need to make predictions from data that are roughly linearly correlated.
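To make the idea concrete, here is a minimal sketch of a simple linear regression fitted by ordinary least squares in plain Python. The five data points are made up for illustration and the function name `fit_line` is hypothetical; a library such as scikit-learn performs the same kind of fit with more features.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x); the 1/n factors cancel.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    # The fitted line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy, roughly linear data (hypothetical values).
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(xs, ys)
prediction = slope * 6 + intercept  # predicted y for a new x = 6
print(round(slope, 2), round(intercept, 2), round(prediction, 2))
```

Because the points lie close to a straight line, the fitted slope is close to 2 and the intercept is close to 0, which is what the data suggest by eye.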

Sport Blood Test dataset

Let’s consider the Australian athletes data set: a nice data set collected in a study of how various characteristics of the blood varied with the sport, body size, and sex of the athlete. These data were the basis for the analyses reported in Telford and Cunningham (1991).

Anybody interested in knowing more about that study can refer to Telford, R.D. and Cunningham, R.B. 1991. Sex, sport and body-size dependency of hematology in highly trained athletes. Medicine and Science in Sports and Exercise 23: 788-794: https://europepmc.org/article/med/1921671

We are going to use Python with Jupyter Notebook to build this model.

Let’s import some useful libraries to start:

Continue reading “Linear Regression”