# Machine learning
### <a href='https://cdsbasel.github.io/dataanalytics/'>Data Analytics for Psychology and Business</a> <a href='mailto:rui.mata@unibas.ch'>rui.mata@unibas.ch</a>
### April 2019

---

<div class="my-footer">
 
 
 <img src="https://raw.githubusercontent.com/therbootcamp/therbootcamp.github.io/master/_sessions/_image/by-sa.png" height=14 style="vertical-align: middle"/>
 
 <a href="https://cdsbasel.github.io/dataanalytics/">
 
 
 cdsbasel.github.io/dataanalytics/
 
 
 </a>
 <a href="https://cdsbasel.github.io/dataanalytics/">
 
 Data Analytics for Psychology and Business | April 2019
 
 </a>
 
 </div>

---

# What do you think?

No Googling :)

---

# What is machine learning?

Machine learning is...

...a <high>field of artificial intelligence</high>...

...that uses <high>statistical techniques</high>...

...to allow computer systems to <high>"learn"</high>,...

...i.e., to progressively <high>improve performance</high> on a specific task...

...from small or large amounts of <high>data</high>,...

...<high>without being explicitly programmed</high>...

...with the goal of <high>discovering structure</high> or <high>improving decision making and predictions</high>.

<img src="image/ml_robot.jpg" height=380px> 
from <a href="https://medium.com/@dkwok94/machine-learning-for-my-grandma-ca242e97ef62">medium.com</a>

---

# Easy to confuse

AI is <high>intelligence demonstrated by machines</high>, in contrast to the natural intelligence displayed by humans and animals.

Statistics is a <high>branch of mathematics</high> dealing with data collection, organization, analysis, interpretation and presentation.

Big Data deals with data sets that are <high>too large or complex</high> to be dealt with by traditional data-processing application software.

Data Science is a multi-disciplinary field that uses scientific methods, processes, algorithms and systems to <high>extract knowledge and insights</high> from structured and unstructured data.

---

# Types of machine learning tasks

There are many types of machine learning tasks, each of which calls for different models.

<high>We will focus on supervised machine learning</high>.

<img src="image/mltypes.png" height=500px> 
from <a href="image/mltypes.png">amazonaws.com</a>

---

# Data terminology

<table style="cellspacing:0; cellpadding:0; border:none;">
<tr>
 <td bgcolor="white">
 Term
 </td>
 <td bgcolor="white">
 Definition
 </td> 
 <td bgcolor="white">
 Example
 </td> 
</tr>
<tr>
 <td bgcolor="white">
 Case
 </td>
 <td bgcolor="white">
 A specific <high>observation</high> of data.
 </td> 
 <td bgcolor="white">
 A patient, a site, etc.
 </td> 
</tr>
<tr>
 <td bgcolor="white">
 Feature
 </td>
 <td bgcolor="white">
 A measurable <high>property</high> of cases. Also called a predictor.
 </td> 
 <td bgcolor="white">
 Age, temperature, country, etc.
 </td> 
</tr>
<tr>
 <td bgcolor="white">
 Criterion
 </td>
 <td bgcolor="white">
 The <high>feature</high> that you want to <high>predict</high>.
 </td> 
 <td bgcolor="white">
 Heart attack, sales, etc.
 </td> 
</tr>
<tr>
 <td bgcolor="white">
 Data
 </td>
 <td bgcolor="white">
 Typically <high>rectangular</high> representation of cases (rows) and features (columns).
 </td> 
 <td bgcolor="white">
 <mono>.csv</mono>, <mono>.xls</mono>, <mono>.sav</mono>, etc.
 </td> 
</tr>
</table>
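
As a rough illustration of these terms, the sketch below stores a tiny rectangular data set and splits it into features and criterion. All values, the column names (`age`, `cp`, `chol`, `heart_attack`), and the `"other"` chest pain code are made up for this example.

```python
# Hypothetical toy data: each dict is one case (row), each key one feature (column).
cases = [
    {"age": 63, "cp": "other", "chol": 233, "heart_attack": 1},
    {"age": 41, "cp": "a", "chol": 204, "heart_attack": 0},
    {"age": 67, "cp": "other", "chol": 286, "heart_attack": 1},
]

criterion_name = "heart_attack"  # the feature we want to predict

# Split the rectangular data into predictors (features) and the criterion.
feature_names = [name for name in cases[0] if name != criterion_name]
features = [[case[name] for name in feature_names] for case in cases]
criterion = [case[criterion_name] for case in cases]

print(feature_names)  # ['age', 'cp', 'chol']
print(features[0])    # [63, 'other', 233]
print(criterion)      # [1, 0, 1]
```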

---

# Supervised learning

The <high>dominant type</high> of machine learning.

Supervised learning uses <high>labeled data</high> to learn <high>a model</high> that relates the criterion to the features.

Verbal model

<mono>if cp (chest pain) is not a (asymptomatic) and age is greater than 60 then high probability of heart attack, otherwise low probability.</mono>
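
The verbal model can also be written as a small function. This is only a sketch: the function name and the `"ta"` code for a non-asymptomatic chest pain type are assumptions for illustration; only `"a"` (asymptomatic) and the age cutoff of 60 come from the rule itself.

```python
def heart_attack_risk(cp: str, age: float) -> str:
    """Return 'high' or 'low' following the verbal model."""
    # cp == "a" stands for asymptomatic chest pain.
    if cp != "a" and age > 60:
        return "high"
    return "low"


print(heart_attack_risk(cp="ta", age=65))  # high
print(heart_attack_risk(cp="a", age=65))   # low
print(heart_attack_risk(cp="ta", age=50))  # low
```

In supervised learning, a rule like this is not written by hand but learned from labeled cases.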

<img src="image/supervised.png"> 

---

# 3 key (supervised) models

---

# 2 types of supervised problems

There are two types of supervised learning problems that can often be approached using the same model.

Regression

Regression problems involve the <high>prediction of a quantitative feature</high>.

E.g., predicting the cholesterol level as a function of age.

Classification

Classification problems involve the <high>prediction of a categorical feature</high>.

E.g., predicting the origin of chest pain as a function of age and heart attack risk.
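
To illustrate how one model can serve both problem types, the sketch below fits the same straight line twice: once to a quantitative criterion (regression) and once to a 0/1-coded criterion whose prediction is thresholded at 0.5 (classification). The data are invented, and using a thresholded line as a classifier is a simplification chosen for brevity, not a recommended method.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


age = [30, 40, 50, 60, 70]

# Regression: quantitative criterion (made-up cholesterol levels).
chol = [180, 200, 220, 240, 260]
intercept, slope = fit_line(age, chol)
predicted_chol = intercept + slope * 55  # prediction for a 55-year-old

# Classification: categorical criterion coded 0/1 (made-up labels).
heart_attack = [0, 0, 0, 1, 1]
intercept_c, slope_c = fit_line(age, heart_attack)
predicted_class = int(intercept_c + slope_c * 65 >= 0.5)  # 1 = predicted "yes"

print(predicted_chol)
print(predicted_class)
```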

---

# Unsupervised learning

Analyzes the relationships among cases (<high>clustering</high>) or among features (<high>dimensionality reduction</high>) to <high>discover structures</high> such as groups or meta-features.

<table style="cellspacing:0; cellpadding:0; border:none;">
<tr>
 <td bgcolor="white">
 Approach
 </td>
 <td bgcolor="white">
 Description
 </td> 
 <td bgcolor="white">
 Example
 </td> 
</tr>
<tr>
 <td bgcolor="white">
 Clustering
 </td>
 <td bgcolor="white">
 Analyze distances between cases to identify <high>clusters of homogeneous cases</high>.
 </td> 
 <td bgcolor="white">
 Types of customers or patients.
 </td> 
</tr>
<tr>
 <td bgcolor="white">
 Dimensionality reduction
 </td>
 <td bgcolor="white">
 Analyze correlations between features to identify <high>higher order features</high>. 
 </td> 
 <td bgcolor="white">
 Dimensions of personality or user experience.
 </td> 
</tr>
</table>
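
The clustering idea can be sketched with a very small k-means on a single feature. This is a minimal illustration, not a production implementation: the function name, the spending values, and the starting centers are all assumptions.

```python
def k_means_1d(values, centers, n_iter=10):
    """Tiny k-means on one feature: assign cases to the nearest center, then update."""
    centers = list(centers)
    for _ in range(n_iter):
        clusters = [[] for _ in centers]
        for v in values:
            # Distance between the case and each center decides its cluster.
            nearest = min(range(len(centers)), key=lambda k: abs(v - centers[k]))
            clusters[nearest].append(v)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [sum(c) / len(c) if c else centers[k] for k, c in enumerate(clusters)]
    return centers, clusters


spending = [10, 12, 11, 50, 52, 49]  # made-up customer spending
centers, clusters = k_means_1d(spending, centers=[0, 60])

print(clusters)  # [[10, 12, 11], [50, 52, 49]]
print(centers)
```

No criterion is involved here: the two customer groups emerge only from the distances between cases.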

---

# Reinforcement learning

<high>Learns iteratively</high> from minimal supervision provided by <high>performance feedback</high>.

RL is closely <high>related to psychological theories of learning</high>.

Examples

<table style="cellspacing:0; cellpadding:0; border:none;">
 <col width="30%">
 <col width="70%">
<tr>
 <td bgcolor="white">
 Application
 </td>
 <td bgcolor="white">
 Description
 </td> 
</tr>
<tr>
 <td bgcolor="white">
 Model fitting
 </td>
 <td bgcolor="white">
 Iteratively <high>change model parameters</high> to improve prediction.
 </td>
</tr>
<tr>
 <td bgcolor="white">
 Robot movements
 </td>
 <td bgcolor="white">
 Iteratively <high>change movement</high> patterns to increase pancake-catch probability.
 </td>
</tr>
<tr>
 <td bgcolor="white">
 Games
 </td>
 <td bgcolor="white">
 Iteratively <high>change controller input</high> patterns to improve Mario Kart racing time.
 </td>
</tr>
</table>
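
Learning from performance feedback alone can be sketched with an epsilon-greedy agent that chooses repeatedly among a few options and only ever sees the reward of its own choice. The payoff probabilities, trial count, exploration rate, and seed are assumptions chosen for illustration.

```python
import random


def epsilon_greedy(payoff_probs, n_trials=5000, epsilon=0.1, seed=1):
    """Learn which option pays off most often from reward feedback alone."""
    rng = random.Random(seed)
    estimates = [0.0] * len(payoff_probs)  # current value estimate per option
    counts = [0] * len(payoff_probs)       # how often each option was tried
    for _ in range(n_trials):
        if rng.random() < epsilon:
            choice = rng.randrange(len(payoff_probs))  # explore a random option
        else:
            choice = max(range(len(estimates)), key=lambda k: estimates[k])  # exploit
        reward = 1.0 if rng.random() < payoff_probs[choice] else 0.0
        counts[choice] += 1
        # Incremental mean: nudge the estimate toward the observed reward.
        estimates[choice] += (reward - estimates[choice]) / counts[choice]
    return estimates, counts


estimates, counts = epsilon_greedy([0.2, 0.5, 0.8])
best = max(range(len(estimates)), key=lambda k: estimates[k])
print(best, estimates, counts)
```

The agent is never told which option is best; its estimates improve iteratively because each reward slightly corrects the estimate for the chosen option.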

<img src="image/roboarm.gif" width=320px> 
from <a href="https://giphy.com/explore/reinforcement-learning">giphy.com</a>

<img src="image/mariokart.gif" width=320px> 
from <a href="https://blogs.nvidia.com/blog/2017/04/14/tensorkart-ai-mario-kart/">nvidia.com</a>

---

# Machine learning is more than algorithms

<img src="image/mlsteps.png" height=440px> 
from <a href="https://www.houseofbots.com/images/news/11493/cover.png">houseofbots.com</a>

---
class: middle, center

<h1><a href=https://cdsbasel.github.io/dataanalytics/menu/materials.html>Materials</a></h1>