class: center, middle, inverse, title-slide # Machine learning ###
Data Analytics for Psychology and Business
### April 2019 --- layout: true <div class="my-footer"> <span style="text-align:center"> <span> <img src="https://raw.githubusercontent.com/therbootcamp/therbootcamp.github.io/master/_sessions/_image/by-sa.png" height=14 style="vertical-align: middle"/> </span> <a href="https://cdsbasel.github.io/dataanalytics/"> <span style="padding-left:82px"> <font color="#7E7E7E"> cdsbasel.github.io/dataanalytics/ </font> </span> </a> <a href="https://cdsbasel.github.io/dataanalytics/"> <font color="#7E7E7E"> Data Analytics for Psychology and Business | April 2019 </font> </a> </span> </div> --- class: middle, center # What do you think? No Googling :) --- # What is machine learning? .pull-left45[ <b>Machine learning is</b>... <p style="padding-left:20px"> ...a <high>field of artificial intelligence</high>...<br><br> ...that uses <high>statistical techniques</high>... <br><br> ...to allow computer systems to <high>"learn"</high>,...<br><br> ...i.e., to progressively <high>improve performance</high> on a specific task...<br><br> ...from small or large amounts of <high>data</high>,... <br><br> ....<high>without being explicitly programmed</high>....<br><br> ....with the goal to <high>discover structure</high> or </high>improve decision making and predictions</high>. </p> ] .pull-right45[ <p align = "center"> <img src="image/ml_robot.jpg" height=380px><br> <font style="font-size:10px">from <a href="https://medium.com/@dkwok94/machine-learning-for-my-grandma-ca242e97ef62">medium.com</a></font> </p> ] --- # Easy to confuse .pull-left45[ <font style="font-size:24px"><b>AI</b></font> is <high>intelligence demonstrated by machines</high>, in contrast to the natural intelligence displayed by humans and animals. <font style="font-size:24px"><b>Statistics</b></font> is a <high>branch of mathematics</high> dealing with data collection, organization, analysis, interpretation and presentation. <font style="font-size:24px"><b>Big Data</b></font> deals with data sets that are <high>too large or complex</high> to be dealt with by traditional data-processing application software. <font style="font-size:24px"><b>Data Science</b></font> is a multi-disciplinary field that uses scientific methods, processes, algorithms and systems to <high>extract knowledge and insights</high> from structured and unstructured data. ] .pull-right45[ <p align = "center"> <img src="image/areas.png" height=390px><br> </p> ] --- .pull-left3[ # Types of machine learning tasks There are many types of machine learning tasks, each of which call for different models. <high>We will focus on supervised machine learning</high>. ] .pull-right65[ <br><br> <p align = "center"> <img src="image/mltypes.png" height=500px><br> <font style="font-size:10px">from <a href="image/mltypes.png">amazonaws.com</a></font> </p> ] --- # Data terminology .pull-left5[ <p> <table style="cellspacing:0; cellpadding:0; border:none; padding-top:10px"> <tr> <td bgcolor="white"> <b>Term</b> </td> <td bgcolor="white"> <b>Definition</b> </td> <td bgcolor="white"> <b>Example</b> </td> </tr> <tr> <td bgcolor="white"> <i>Case</i> </td> <td bgcolor="white"> A specific <high>observation</high> of data. </td> <td bgcolor="white"> A patient, a site, etc. </td> </tr> <tr> <td bgcolor="white"> <i>Feature</i> </td> <td bgcolor="white"> An measurable <high>property</high> of cases. Also called predictors. </td> <td bgcolor="white"> Age, temperature, country, etc. </td> </tr> <tr> <td bgcolor="white"> <i>Criterion</i> </td> <td bgcolor="white"> The <high>feature</high> that you want to <high>predict</high>. </td> <td bgcolor="white"> Heart attack, sales, etc. </td> </tr> <tr> <td bgcolor="white"> <i>Data</i> </td> <td bgcolor="white"> Typically <high>rectangular</high> representation of cases (rows) and features (columns). </td> <td bgcolor="white"> <mono>.csv</mono>, <mono>.xls</mono>, <mono>.sav</mono>, etc. </td> </tr> </table> </p> ] .pull-right4[ <p align = "center"> <img src="image/terminology.png"><br> </p> ] --- # Supervised learning .pull-left5[ The <high>dominant type</high> of machine learning. Supervised learning uses <high>labeled data</high> to learn <high>a model</high> that relates the criterion to the features. <br> <u>Verbal model</u> <font style="font-size:24px"><mono>if cp (chest pain) is not a (asymptomatic) and age is larger than 60 then high probability of hearth attack, otherwise low probability.</mono></font> ] .pull-right4[ <p align = "center"> <img src="image/supervised.png"><br> </p> ] --- # 3 key (supervised) models <p align = "center" style="padding-top:20px"> <img src="image/models.png"><br> </p> --- # 2 types of supervised problems .pull-left5[ There are two types of supervised learning problems that can often be approached using the same model. <font style="font-size:24px"><b>Regression</b></font> Regression problems involve the <high>prediction of a quantitative feature</high>. E.g., predicting the cholesterol level as a function of age. <font style="font-size:24px"><b>Classification</b></font> Classification problems involve the <high>prediction of a categorical feature</high>. E.g., predicting the origin of chest pain as a function of age and heart attack risk. ] .pull-right4[ <p align = "center"> <img src="image/twotypes.png" height=440px><br> </p> ] --- # Unsupervised learning .pull-left5[ Analyzes the relationships among cases (<high>clustering</high>) or among features (<high>dimensionality reduction</high>) to <high>discover structures</high> such as groups or meta-features. <table style="cellspacing:0; cellpadding:0; border:none; padding-top:10px"> <tr> <td bgcolor="white"> <b>Approach</b> </td> <td bgcolor="white"> <b>Description</b> </td> <td bgcolor="white"> <b>Example</b> </td> </tr> <tr> <td bgcolor="white"> <i>Clustering</i> </td> <td bgcolor="white"> Analyze distances between cases to identify <high>clusters of homogeneous cases</high>. </td> <td bgcolor="white"> Types of customers or patients. </td> </tr> <tr> <td bgcolor="white"> <i>Dimension-<br>ality reduction</i> </td> <td bgcolor="white"> Analyze correlations between features to identify <high>higher order features</high>. </td> <td bgcolor="white"> Dimensions of personality or user experience. </td> </tr> </table> ] .pull-right4[ <p align = "center" height=380px> <img src="image/iris_kmeans.png" height=400px><br> </p> ] --- # Reinforcement learning .pull-left5[ <high>Learns iteratively</high> from minimal supervision provided by <high>performance feedback</high>. RL is closely <high>related to psychological theories of learning</high>. <u>Examples</u> <table style="cellspacing:0; cellpadding:0; border:none;"> <col width="30%"> <col width="70%"> <tr> <td bgcolor="white"> <b>Application</b> </td> <td bgcolor="white"> <b>Description</b> </td> </tr> <tr> <td bgcolor="white"> <i>Model fitting</i> </td> <td bgcolor="white"> Iteratively <high>change model parameters</high> to improve prediction. </tr> <tr> <td bgcolor="white"> <i>Robot movements</i> </td> <td bgcolor="white"> Iteratively <high>change movement</high> patterns to increase pancake-catch probability. </tr> <tr> <td bgcolor="white"> <i>Games</i> </td> <td bgcolor="white"> Iteratively <high>change controller input</high> patterns to improve Mario Kart racing time. </tr> </table> ] .pull-right4[ <p align = "center"> <img src="image/roboarm.gif" width=320px><br> <font style="font-size:10px">from <a href="https://giphy.com/explore/reinforcement-learning">giphy.com</a></font> </p> <p align = "center"> <img src="image/mariokart.gif" width=320px><br> <font style="font-size:10px">from <a href="https://blogs.nvidia.com/blog/2017/04/14/tensorkart-ai-mario-kart/">nvidia.com</a></font> </p> ] --- # Reinforcement learning .pull-left5[ <high>Learns iteratively</high> from minimal supervision provided by <high>performance feedback</high>. RL is closely <high>related to psychological theories of learning</high>. <u>Examples</u> <table style="cellspacing:0; cellpadding:0; border:none;"> <col width="30%"> <col width="70%"> <tr> <td bgcolor="white"> <b>Application</b> </td> <td bgcolor="white"> <b>Description</b> </td> </tr> <tr> <td bgcolor="white"> <i>Model fitting</i> </td> <td bgcolor="white"> Iteratively <high>change model parameters</high> to improve prediction. </tr> <tr> <td bgcolor="white"> <i>Robot movements</i> </td> <td bgcolor="white"> Iteratively <high>change movement</high> patterns to increase pancake-catch probability. </tr> <tr> <td bgcolor="white"> <i>Games</i> </td> <td bgcolor="white"> Iteratively <high>change controller input</high> patterns to improve Mario Kart racing time. </tr> </table> ] .pull-right4[ <br><br> <iframe width=550px height=310px src="https://www.youtube.com/embed/tXlM99xPQC8" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe> ] --- # Machine learning is more than algorithms <p align = "center"> <img src="image/mlsteps.png" height=440px><br> <font style="font-size:10px">from <a href="https://www.houseofbots.com/images/news/11493/cover.png">houseofbots.com</a></font> </p> --- class: middle, center <h1><a href=https://cdsbasel.github.io/dataanalytics/menu/materials.html>Materials</a></h1>