Materials
Session 1: Data exploration
Session 2: Statistical modeling & Markdown
Statistical modeling
GitHub
- Create a GitHub account at https://github.com/
- Download the GitHub client at https://desktop.github.com/ (Linux: see this)
- Open the GitHub client and sign into your account
- Fork this repository: https://github.com/cdsbasel/DataAnalytics4PB. Use the Fork button in the top right of the page.
- Clone the forked repository from your GitHub client while signed in to your account. The repo address should be "YOURACCOUNTNAME/DataAnalytics4PB".
- You should now have a folder called "DataAnalytics4PB" in your GitHub folder on your hard drive (e.g., ~/Documents/GitHub).
Markdown
Session 3: Machine learning
College data set description
The data are taken from the College dataset in the ISLR package. They contain statistics for a large number of US colleges from the 1995 issue of US News and World Report.
| Variable | Description |
| --- | --- |
| Private | A factor with levels No and Yes indicating private or public university. |
| Apps | Number of applications received. |
| Accept | Number of applications accepted. |
| Enroll | Number of new students enrolled. |
| Top10perc | Pct. new students from top 10% of H.S. class. |
| Top25perc | Pct. new students from top 25% of H.S. class. |
| F.Undergrad | Number of fulltime undergraduates. |
| P.Undergrad | Number of parttime undergraduates. |
| Outstate | Out-of-state tuition. |
| Room.Board | Room and board costs. |
| Books | Estimated book costs. |
| Personal | Estimated personal spending. |
| PhD | Pct. of faculty with Ph.D.’s. |
| Terminal | Pct. of faculty with terminal degree. |
| S.F.Ratio | Student/faculty ratio. |
| perc.alumni | Pct. alumni who donate. |
| Expend | Instructional expenditure per student. |
| Grad.Rate | Graduation rate. |
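
As a rough illustration of how the count variables relate to one another, Apps, Accept, and Enroll can be combined into derived rates such as the acceptance rate (Accept / Apps) and the yield (Enroll / Accept). The sketch below is written in Python purely for illustration, while the dataset itself ships with the R package ISLR. The record is a made-up example, not a row from the dataset, and the helper names `acceptance_rate` and `yield_rate` are hypothetical.

```python
# Hypothetical record using the variable names from the description above.
# The numbers are invented for illustration and are NOT taken from the dataset.
example_college = {
    "Private": "Yes",
    "Apps": 2000,
    "Accept": 1500,
    "Enroll": 600,
}


def acceptance_rate(college):
    """Share of received applications that were accepted (Accept / Apps)."""
    return college["Accept"] / college["Apps"]


def yield_rate(college):
    """Share of accepted applications that led to enrollment (Enroll / Accept)."""
    return college["Enroll"] / college["Accept"]


if __name__ == "__main__":
    print(f"Acceptance rate: {acceptance_rate(example_college):.2f}")
    print(f"Yield: {yield_rate(example_college):.2f}")
```

The same ratios could be computed column-wise on the full data set once it is loaded; the dictionary here only stands in for a single row.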