# Wrangling
### <a href='https://cdsbasel.github.io/dataanalytics/'>Data Analytics for Psychology and Business</a>
### February 2019

---

<div class="my-footer">
 
 
 <img src="https://raw.githubusercontent.com/therbootcamp/therbootcamp.github.io/master/_sessions/_image/by-sa.png" height=14 style="vertical-align: middle"/>
 
 <a href="https://cdsbasel.github.io/dataanalytics/">
 
 
 cdsbasel.github.io/dataanalytics/
 
 
 </a>
 <a href="https://cdsbasel.github.io/dataanalytics/">
 
 Data Analytics for Psychology and Business | February 2019
 
 </a>
 
 </div>

---

# Data types outside of R

<table class="tg" style="cellspacing:0; cellpadding:0; border:none;" width="95%">
<col width=30%>
<col width=30%>
<col width=30%>
<tr>
 <td bgcolor = 'white' style='vertical-align:top'>
 <ul>
 <li class="m1"><high>Structured data</high>
 <ul class="level">
 <li>Delimiter separated: <mono>.csv</mono>, <mono>.txt</mono>, etc.</li>
 <li>Relational databases: <mono>SQL</mono></li>
 </ul>
 <img src="image/structured.png" height=250px>
 </li>
 </ul>
 </td>
 <td bgcolor = 'white' style='vertical-align:top'>
 <ul>
 <li class="m2"><high>Semi-structured data</high>
 <ul class="level">
 <li>Markup: <mono>.xml</mono>, <mono>.xls</mono>, <mono>.html</mono> etc.</li>
 <li>Non markup: <mono>JSON</mono>, <mono>MongoDB</mono></li>
 </ul>
 <img src="image/html.png" height=250px>
 </li>
 </ul>
 </td>
 <td bgcolor = 'white' style='vertical-align:top'>
 <ul>
 <li class="m3"><high>Unstructured data</high>
 <ul class="level">
 <li>e.g., text</li>
 </ul>
 <br2><img src="image/text.png" height=250px>
 </li>
 </ul>
 </td>
 </tr>
</table>

---

# Delimiter separated data

<ul>
 <li class="m1"><high>Delimiter</high> separates columns.</li>
 <li class="m2">Typically available as <high>local text file</high>.</li>
 <li class="m3"><high>Data types</high> are inferred.</li>
</ul>

---

# Delimiter separated data

```r
# Load the tidyverse (provides readr)
library(tidyverse)

# Read the Basel data set
basel <- read_csv("1_Data/basel.csv")

# Use an explicit delimiter
basel <- read_delim("1_Data/basel.csv",
                    delim = ",")
basel
```

```
## # A tibble: 10,000 x 20
## id geschlecht alter groesse
## <dbl> <chr> <dbl> <dbl>
## 1 1 f 87 165 
## 2 2 m 54 175.
## 3 3 f 34 147.
## 4 4 m 31 166.
## 5 5 m 24 180.
## # … with 9,995 more rows, and 16
## # more variables
```
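
The two calls above can be tried without the course data. The following is a minimal sketch, assuming a small made-up file written to a temporary path; the column names are borrowed from the output above, while the values and the path are purely illustrative.

```r
library(readr)

# Hypothetical mini data set written to a temporary file
path <- tempfile(fileext = ".csv")
writeLines(c("id,geschlecht,alter",
             "1,f,87",
             "2,m,54"), path)

# read_csv() assumes a comma; read_delim() takes the delimiter explicitly
via_csv   <- read_csv(path)
via_delim <- read_delim(path, delim = ",")

# Same column names and same parsed values either way
same_names  <- identical(names(via_csv), names(via_delim))
same_values <- identical(via_csv$alter, via_delim$alter)
```

For files that use another separator, such as a semicolon or a tab, only the `delim` argument of `read_delim()` should need to change.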

---

# Identify file path

<ul>
 <li class="m1">Identify file path using RStudio's <high>auto-complete</high>.</li>
 <li class="m2">Place cursor between quotation marks and press <highm>tab</highm>.</li>
</ul>

---


# Inferred data types

```r
# Read basel data set
basel <- read_csv("1_Data/basel.csv")
```

```
## Parsed with column specification:
## cols(
##   .default = col_double(),
##   geschlecht = col_character(),
##   bildung = col_character(),
##   konfession = col_character(),
##   fasnacht = col_character(),
##   sehhilfe = col_character()
## )
```

```
## See spec(...) for full column specifications.
```

---

# Inferred data types

<ul>
 <li class="m1">Sometimes <mono>readr</mono> needs a little help to <high>correctly identify data types</high>.</li>
</ul>

```r
# Explicitly define missing values
basel <- read_csv("1_Data/basel.csv",
 na = c('NA'))

# Re-infer data types
basel <- type_convert(basel)
```
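
Why the `na` argument matters can be seen in a small sketch. It assumes a hypothetical file in which missing ages are coded as the text `unknown`; the file, its values, and that missing-value code are illustrative and not part of the Basel data.

```r
library(readr)

# Hypothetical file in which a missing age is coded as "unknown"
path <- tempfile(fileext = ".csv")
writeLines(c("id,alter",
             "1,87",
             "2,unknown",
             "3,34"), path)

# Without help, the stray text forces the whole column to character
raw <- read_csv(path)
class(raw$alter)

# Declaring the missing-value code lets readr infer a numeric column
fixed <- read_csv(path, na = c("", "NA", "unknown"))
class(fixed$alter)
```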

---

# Semi-structured data with <a href="https://github.com/r-lib/xml2"><mono>xml2</mono></a> and <a href="https://github.com/hadley/rvest"><mono>rvest</mono></a>

```r
# Load table from Wikipedia (with xml2 and rvest)
read_html("https://en.wikipedia.org/wiki/R_(programming_language)") %>%
  html_node(xpath = '//*[@id="mw-content-text"]/div/table[2]') %>%
  html_table() %>% as_tibble()
```

```
## # A tibble: 17 x 3
## Release Date Description 
## <chr> <chr> <chr> 
## 1 0.16 "" "This is the last alpha version developed primarily by Ihaka and G…
## 2 0.49 "1997-04-2… "This is the oldest source release which is currently available on…
## 3 0.60 "1997-12-0… "R becomes an official part of the GNU Project. The code is hosted…
## 4 0.65.1 "1999-10-0… "First versions of update.packages and install.packages functions …
## 5 1.0 "2000-02-2… "Considered by its developers stable enough for production use.[50…
## 6 1.4 "2001-12-1… "S4 methods are introduced and the first version for Mac OS X is m…
## 7 1.8 "2003-10-0… "Introduced a flexible condition handling mechanism for signalling…
## 8 2.0 "2004-10-0… "Introduced lazy loading, which enables fast loading of data with …
## 9 2.1 "2005-04-1… "Support for UTF-8 encoding, and the beginnings of internationaliz…
## 10 2.11 "2010-04-2… "Support for Windows 64 bit systems." 
## # … with 7 more rows
```

---

# Other data formats: see <a href="https://cran.r-project.org/web/packages/rio/vignettes/rio.html">rio</a>

```r
# from package readr: read fixed-width files (can be fast)
data <- read_fwf(file, ...)

# from package readr: read Apache-style log files
data <- read_log(file, ...)
```

### `haven` <img src="image/haven.png" width="50" align="right">

```r
# read SAS's .sas7bdat and .sas7bcat files
data <- read_sas(file, ...)

# read SPSS's .sav files
data <- read_sav(file, ...)

# etc
```

```r
# from package readxl: read Excel's .xls and .xlsx files
data <- read_excel(file, ...)
```
 
### Other

```r
# Read Matlab .mat files
data <- R.matlab::readMat(file, ...)

# Read and wrangle .xml and .html
data <- XML::xmlParse(file, ...)

# from package jsonlite: read .json files
data <- jsonlite::read_json(file, ...)
```

---

# What is Wrangling?

<img src="image/wrangling.jpeg" height=450px> 
from <a href="https://datasciencebe.com/tag/data-wrangling/">datasciencebe.com</a>

---

# This is Wrangling!

<ul>
 <li class="m1"><high>Transform</high>
 
 <ul class="level">
 <li>Rename columns</li>
 <li>Create new variables</li>
 </ul></li>
 <li class="m2"><high>Organize</high>
 
 <ul class="level">
 <li>Sort</li>
 <li>Join data sets</li>
 <li>Flip columns and rows</li>
 </ul></li>
 <li class="m3"><high>Aggregate</high>
 
 <ul class="level">
 <li>Create groups</li>
 <li>Calculate statistics for groups</li>
 </ul></li>
</ul>

---

# 2 'dirty' data sets

<ul>
 <li class="m1"><high>Rename</high>: Add intuitive column names</li>
 <li class="m2"><high>Recode</high>: Change to appropriate units.</li>
 <li class="m3"><high>Join</high>: Join datasets.</li>
 <li class="m4"><high>Sort</high>: Sort datasets.</li>
 <li class="m5"><high>Filter</high>: Select relevant cases.</li>
 <li class="m6"><high>Select</high>: Select relevant variables.</li>
</ul>

```r
patients
```

```
## # A tibble: 5 x 3
## id X1 X2
## <dbl> <dbl> <dbl>
## 1 1 37 1
## 2 2 65 2
## 3 3 57 2
## 4 4 34 1
## 5 5 45 2
```

```r
results
```

```
## # A tibble: 5 x 3
## id t_1 t_2
## <dbl> <dbl> <dbl>
## 1 4 100 105
## 2 92 134 150
## 3 1 123 135
## 4 2 143 140
## 5 99 102 68
```
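
To follow along without the original files, the two tibbles printed above can be rebuilt by hand. This is a sketch that simply retypes the printed values with `tibble()`; how the course actually loads `patients` and `results` is not shown on the slides.

```r
library(tibble)

# Patient data: X1 and X2 are the uninformative original column names
patients <- tibble(
  id = c(1, 2, 3, 4, 5),
  X1 = c(37, 65, 57, 34, 45),
  X2 = c(1, 2, 2, 1, 2)
)

# Test results at two time points; some ids have no matching patient
results <- tibble(
  id  = c(4, 92, 1, 2, 99),
  t_1 = c(100, 134, 123, 143, 102),
  t_2 = c(105, 150, 135, 140, 68)
)
```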

---

# The mighty `tidyverse`

The [`tidyverse`](https://www.tidyverse.org/) is a collection of high-performing, user-friendly R packages created explicitly for efficient data analytics. Its core packages include:

1. `ggplot2` for graphics.
2. <high><mono>dplyr</mono> for data wrangling</high>.
3. <high><mono>tidyr</mono> for data wrangling</high>.
4. `readr` for data I/O.
5. `purrr` for functional programming.
6. `tibble` for modern `data.frame`s.

---

# <mono>`%>%`</mono>

<ul>
 <li class="m1">The preferred way of using <mono>dplyr</mono> relies on a <high>novel operator</high>, the pipe <highm>%>%</highm>.</li>
</ul>

<img src="image/pipe.jpg" width = "300px"> 
from <a href="https://upload.wikimedia.org/wikipedia/en/thumb/b/b9/MagrittePipe.jpg">wikimedia.org</a>

```r
# Numerical vector
score <- c(8, 4, 6, 3, 7, 3)
score
```

```
## [1] 8 4 6 3 7 3
```

```r
# mean: Base-R-style
mean(score)
```

```
## [1] 5.167
```

```r
# mean: dplyr-style
score %>%  
  mean()  
```

```
## [1] 5.167
```
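
The benefit of the pipe shows once several steps are chained. The sketch below adds a rounding step purely for illustration (it is not part of the slide) and loads `magrittr`, the package that provides `%>%` and that `dplyr` re-exports it from.

```r
library(magrittr)

score <- c(8, 4, 6, 3, 7, 3)

# Nested call: read from the inside out
nested <- round(mean(score), digits = 1)

# Piped call: read from top to bottom
piped <- score %>%
  mean() %>%
  round(digits = 1)

# Both spellings compute the same value
identical(nested, piped)
```

The pipe passes the left-hand side as the first argument of the function on the right, so `score %>% mean()` is just another way of writing `mean(score)`.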

---


# Transform

<ul>
 <li class="m1"><high>Rename</high>: Choose intuitive column names.
 
 <ul class="level">
 <li><mono>rename()</mono></li>
 </ul>
 </li>
 <li class="m2"><high>Recode</high>: Choose appropriate units and labels. 
 
 <ul class="level">
 <li><mono>mutate()</mono></li>
 <li><mono>case_when()</mono></li>
 </ul>
 </li>
 <li class="m3"><high>Join</high>: Join datasets
 
 <ul class="level">
 <li><mono>left_join()</mono></li>
 </ul>
 </li>
</ul>

```r
patients
```

```
## # A tibble: 5 x 3
## id X1 X2
## <dbl> <dbl> <dbl>
## 1 1 37 1
## 2 2 65 2
## 3 3 57 2
## 4 4 34 1
## 5 5 45 2
```

```r
results
```

```
## # A tibble: 5 x 3
## id t_1 t_2
## <dbl> <dbl> <dbl>
## 1 4 100 105
## 2 92 134 150
## 3 1 123 135
## 4 2 143 140
## 5 99 102 68
```

---

# `rename()`

```r
patients %>%
  rename(NEW = OLD,
         NEW = OLD)
```

```r
# Start with the data set
patients %>%

  # Change the column names
  rename(age = X1,
         condition = X2)
```

```
## # A tibble: 5 x 3
## id age condition
## <dbl> <dbl> <dbl>
## 1 1 37 1
## 2 2 65 2
## 3 3 57 2
## 4 4 34 1
## 5 5 45 2
```

---

# `mutate()`

```r
tibble %>%
  mutate(
   NAME1 = DEFINITION1,
   NAME2 = DEFINITION2,
   NAME3 = DEFINITION3,
   ...
  )
```

```r
patients %>% 
  rename(age = X1, 
         condition = X2) %>%
  
  # Create new variables
  mutate(monate = age * 12,
         dekaden = age / 10)
```

```
## # A tibble: 5 x 5
## id age condition monate dekaden
## <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 1 37 1 444 3.7
## 2 2 65 2 780 6.5
## 3 3 57 2 684 5.7
## 4 4 34 1 408 3.4
## 5 5 45 2 540 4.5
```

---

# `case_when()`

```r
tibble %>%
  mutate(
    NAME = case_when(
      LOGICAL1 ~ VALUE1,
      LOGICAL2 ~ VALUE2,
      ...
      )
    )
```

```r
patients %>% 
  rename(age = X1, 
         condition = X2) %>%
  
# Create cond_label from condition
  mutate(cond_label = case_when(
    condition == 1 ~ "placebo",
    condition == 2 ~ "medication"))
```

```
## # A tibble: 5 x 4
## id age condition cond_label
## <dbl> <dbl> <dbl> <chr> 
## 1 1 37 1 placebo 
## 2 2 65 2 medication
## 3 3 57 2 medication
## 4 4 34 1 placebo 
## 5 5 45 2 medication
```
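
Values that match none of the conditions become `NA`. A common way to handle them is a final catch-all condition, sketched here on a made-up tibble that contains an unexpected code and a missing value; the data and the `"unknown"` label are assumptions for illustration.

```r
library(dplyr)

# Hypothetical data with an unexpected condition code (3) and a missing value
patients <- tibble(id = 1:4, condition = c(1, 2, 3, NA))

labelled <- patients %>%
  mutate(cond_label = case_when(
    condition == 1 ~ "placebo",
    condition == 2 ~ "medication",
    TRUE           ~ "unknown"   # catch-all for everything else
  ))

labelled$cond_label
```

Conditions are checked in order and the first match wins, so the catch-all belongs at the end.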

---

# Join datasets
 

 <img src="image/joining_data.png" height="450px">
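
The difference between the join types pictured above can be sketched with reduced versions of the two example tibbles. Only `id`, `age`, and `t_1` are kept here to stay short; this is an illustration under those assumptions, not the full course data.

```r
library(dplyr)

patients <- tibble(id  = c(1, 2, 3, 4, 5),
                   age = c(37, 65, 57, 34, 45))
results  <- tibble(id  = c(4, 92, 1, 2, 99),
                   t_1 = c(100, 134, 123, 143, 102))

# inner_join(): only ids present in both tibbles
inner <- patients %>% inner_join(results, by = "id")

# left_join(): every patient, NA where no result exists
left <- patients %>% left_join(results, by = "id")

nrow(inner)            # ids 1, 2 and 4 appear in both tibbles
nrow(left)             # all patients are kept
sum(is.na(left$t_1))   # patients without a result
```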

---

# `inner_join()`

```r
TIBBLE1 %>%
  inner_join(TIBBLE2, 
             by = c("KEY"))
```

```r
patients %>% 
  rename(age = X1, condition = X2) %>%
  mutate(cond_label = case_when(
    condition == 1 ~ "placebo",
    condition == 2 ~ "medication")) %>%
  
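  # Join with results
  inner_join(results, by = "id")
```

```
## # A tibble: 3 x 6
## id age condition cond_label t_1 t_2
## <dbl> <dbl> <dbl> <chr> <dbl> <dbl>
## 1 1 37 1 placebo 123 135
## 2 2 65 2 medication 143 140
## 3 4 34 1 placebo 100 105
```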

---

# `left_join()`

```r
TIBBLE1 %>%
  left_join(TIBBLE2, 
            by = c("KEY"))
```

```r
patients %>% 
  rename(age = X1, condition = X2) %>%
  mutate(cond_label = case_when(
    condition == 1 ~ "placebo",
    condition == 2 ~ "medication")) %>%
  
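  # Join with results
  left_join(results, by = "id")
```

```
## # A tibble: 5 x 6
## id age condition cond_label t_1 t_2
## <dbl> <dbl> <dbl> <chr> <dbl> <dbl>
## 1 1 37 1 placebo 123 135
## 2 2 65 2 medication 143 140
## 3 3 57 2 medication NA NA
## 4 4 34 1 placebo 100 105
## 5 5 45 2 medication NA NA
```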

---

# Organize

<ul>
 <li class="m4"><high>Sort</high>: Sort data.
 
 <ul class="level">
 <li><mono>arrange()</mono></li>
 </ul>
 </li>
 <li class="m5"><high>Filter</high>: Select relevant cases.
 
 <ul class="level">
 <li><mono>slice()</mono></li>
 <li><mono>filter()</mono></li>
 </ul>
 </li>
 <li class="m6"><high>Select</high>: Select relevant variables.
 
 <ul class="level">
 <li><mono>select()</mono></li>
 </ul>
 </li>
</ul>

```r
# Joined tibble
patient_results
```

```
## # A tibble: 5 x 6
## id age condition cond_label t_1 t_2
## <dbl> <dbl> <dbl> <chr> <dbl> <dbl>
## 1 1 37 1 placebo 123 135
## 2 2 65 2 medication 143 140
## 3 3 57 2 medication NA NA
## 4 4 34 1 placebo 100 105
## 5 5 45 2 medication NA NA
```

---

# `arrange()`

```r
# Sort ascending
tibble %>%
  arrange(VAR1, VAR2)

# Sort descending (with desc())
tibble %>%
  arrange(desc(VAR1), VAR2)
```

```r
patient_results %>%
  
  # Sort according to condition
  arrange(condition)
```

```
## # A tibble: 5 x 6
## id age condition cond_label t_1 t_2
## <dbl> <dbl> <dbl> <chr> <dbl> <dbl>
## 1 1 37 1 placebo 123 135
## 2 4 34 1 placebo 100 105
## 3 2 65 2 medication 143 140
## 4 3 57 2 medication NA NA
## 5 5 45 2 medication NA NA
```

---

# `arrange()`

```r
# Sort ascending
tibble %>%
  arrange(VAR1, VAR2)

# Sort descending (with desc())
tibble %>%
  arrange(desc(VAR1), VAR2)
```

```r
patient_results %>%
  
  # Sort according to both
  arrange(condition, age) 
```

```
## # A tibble: 5 x 6
## id age condition cond_label t_1 t_2
## <dbl> <dbl> <dbl> <chr> <dbl> <dbl>
## 1 4 34 1 placebo 100 105
## 2 1 37 1 placebo 123 135
## 3 5 45 2 medication NA NA
## 4 3 57 2 medication NA NA
## 5 2 65 2 medication 143 140
```
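
The `desc()` variant from the template can be mixed with ascending keys. A short sketch, assuming a cut-down copy of `patient_results` with only the three columns needed:

```r
library(dplyr)

# Reduced copy of patient_results (illustrative subset of columns)
patient_results <- tibble(
  id        = c(1, 2, 3, 4, 5),
  age       = c(37, 65, 57, 34, 45),
  condition = c(1, 2, 2, 1, 2)
)

# Ascending by condition, then oldest patients first within each condition
sorted <- patient_results %>%
  arrange(condition, desc(age))

sorted$id
```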

---

# `slice()`

```r
# Slice with sequence
patient_results %>%
  slice(INDEX_START:INDEX_STOP)

# Slice with vector  
patient_results %>%
  slice(c(INDEX1, INDEX2, ...))
```

```r
patient_results %>%
  arrange(condition, age) %>%

  # Rows 3 and 5
  slice(c(3, 5))
```

```
## # A tibble: 2 x 6
## id age condition cond_label t_1 t_2
## <dbl> <dbl> <dbl> <chr> <dbl> <dbl>
## 1 5 45 2 medication NA NA
## 2 2 65 2 medication 143 140
```

---

# `slice()`

```r
# Slice with sequence
patient_results %>%
  slice(INDEX_START:INDEX_STOP)

# Slice with vector  
patient_results %>%
  slice(c(INDEX1, INDEX2, ...))
```

```r
patient_results %>%
  arrange(condition, age) %>%

# First 4 rows
  slice(1:4)
```

```
## # A tibble: 4 x 6
## id age condition cond_label t_1 t_2
## <dbl> <dbl> <dbl> <chr> <dbl> <dbl>
## 1 4 34 1 placebo 100 105
## 2 1 37 1 placebo 123 135
## 3 5 45 2 medication NA NA
## 4 3 57 2 medication NA NA
```

---

# `filter()`

```r
patient_results %>%
 filter(VAR1 == VALUE1,
 VAR2 > VALUE2,
 VAR3 < VALUE3,
 VAR4 == VALUE4 | VAR5 < VALUE5)
```

```r
patient_results %>%
  
  # Patients with age > 35
  filter(age > 35)
```

```
## # A tibble: 4 x 6
## id age condition cond_label t_1 t_2
## <dbl> <dbl> <dbl> <chr> <dbl> <dbl>
## 1 1 37 1 placebo 123 135
## 2 2 65 2 medication 143 140
## 3 3 57 2 medication NA NA
## 4 5 45 2 medication NA NA
```

---

# `filter()`

```r
patient_results %>%
 filter(VAR1 == VALUE1,
 VAR2 > VALUE2,
 VAR3 < VALUE3,
 VAR4 == VALUE4 | VAR5 < VALUE5)
```

```r
# Age greater than 35 and cond_label is medication
patient_results %>%
  filter(age > 35,
         cond_label == "medication")
```

```
## # A tibble: 3 x 6
## id age condition cond_label t_1 t_2
## <dbl> <dbl> <dbl> <chr> <dbl> <dbl>
## 1 2 65 2 medication 143 140
## 2 3 57 2 medication NA NA
## 3 5 45 2 medication NA NA
```
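
Two details of `filter()` are easy to miss: conditions separated by commas are combined with AND, and rows for which a condition evaluates to `NA` are dropped. The sketch below assumes a reduced copy of `patient_results`; the thresholds are arbitrary.

```r
library(dplyr)

# Reduced copy of patient_results (illustrative subset of columns)
patient_results <- tibble(
  id  = c(1, 2, 3, 4, 5),
  age = c(37, 65, 57, 34, 45),
  t_1 = c(123, 143, NA, 100, NA)
)

# Comma-separated conditions are combined with AND
both <- patient_results %>%
  filter(age > 35, t_1 > 110)

# | combines conditions with OR
either <- patient_results %>%
  filter(age < 35 | t_1 > 140)

# Patients 3 and 5 never appear: their t_1 comparison is NA
both$id
either$id
```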

---

# `select()`

```r
# Select variables
tibble %>% 
  select(VAR1, VAR2)

# Select all but one variable
tibble %>% 
  select(-VAR1)
```

```r
patient_results %>%
  
  # Select id and condition
  select(id, condition)
```

```
## # A tibble: 5 x 2
## id condition
## <dbl> <dbl>
## 1 1 1
## 2 2 2
## 3 3 2
## 4 4 1
## 5 5 2
```

---

# `select()`

```r
# Select variables
tibble %>% 
  select(VAR1, VAR2)

# Select all but one variable
tibble %>% 
  select(-VAR1)
```

```r
patient_results %>%
  
  # Everything but id
  select(-id)
```

```
## # A tibble: 5 x 5
## age condition cond_label t_1 t_2
## <dbl> <dbl> <chr> <dbl> <dbl>
## 1 37 1 placebo 123 135
## 2 65 2 medication 143 140
## 3 57 2 medication NA NA
## 4 34 1 placebo 100 105
## 5 45 2 medication NA NA
```

---

# `starts_with()`

```r
# Select variables
tibble %>% 
  select(starts_with("PATTERN"))
```

```r
patient_results %>%
  
  # Select variables starting with "t"
  select(starts_with("t"))
```

```
## # A tibble: 5 x 2
## t_1 t_2
## <dbl> <dbl>
## 1 123 135
## 2 143 140
## 3 NA NA
## 4 100 105
## 5 NA NA
```

---

# `contains()`

```r
# Select variables
tibble %>% 
  select(contains("PATTERN"))
```

```r
patient_results %>%
  
  # Select variables that contain "_"
  select(contains("_"))
```

```
## # A tibble: 5 x 3
## cond_label t_1 t_2
## <chr> <dbl> <dbl>
## 1 placebo 123 135
## 2 medication 143 140
## 3 medication NA NA
## 4 placebo 100 105
## 5 medication NA NA
```
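
Selection helpers can be combined with plain column names and with the minus sign. A small sketch, assuming a two-row excerpt of `patient_results`:

```r
library(dplyr)

# Two-row excerpt of patient_results (illustrative)
patient_results <- tibble(
  id         = c(1, 2),
  age        = c(37, 65),
  cond_label = c("placebo", "medication"),
  t_1        = c(123, 143),
  t_2        = c(135, 140)
)

# A named column plus everything starting with "t_"
keep_tests <- patient_results %>%
  select(id, starts_with("t_"))

# Everything except columns whose name contains "_"
drop_underscore <- patient_results %>%
  select(-contains("_"))

names(keep_tests)
names(drop_underscore)
```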

---

# `pivot_longer()`

```r
# wide to long
tibble %>% 
  pivot_longer(cols = VARS,
               names_to = NAME1,
               values_to = NAME2)
```

```r
# Placebo patients only (still in wide format)
patient_results %>% 
  filter(cond_label == "placebo")
```

```
## # A tibble: 2 x 6
## id age condition cond_label t_1 t_2
## <dbl> <dbl> <dbl> <chr> <dbl> <dbl>
## 1 1 37 1 placebo 123 135
## 2 4 34 1 placebo 100 105
```

---

# `pivot_longer()`

```r
# wide to long
DATA %>% 
  pivot_longer(cols = VARS,
               names_to = NAME1,
               values_to = NAME2)
```

```r
# wide to long
patient_results %>% 
  filter(cond_label == "placebo") %>%
  pivot_longer(cols = c("t_1", "t_2"),
               names_to = "zeit",
               values_to = "messung")
```

```
## # A tibble: 4 x 6
## id age condition cond_label zeit messung
## <dbl> <dbl> <dbl> <chr> <chr> <dbl>
## 1 1 37 1 placebo t_1 123
## 2 1 37 1 placebo t_2 135
## 3 4 34 1 placebo t_1 100
## 4 4 34 1 placebo t_2 105
```

---

# `pivot_wider()`

```r
# long to wide
tibble %>% 
  pivot_wider(names_from = VAR1,
              values_from = VAR2)
```

```r
# Data in long format
patient_results_lang
```

---

# `pivot_wider()`

```r
# long to wide
tibble %>% 
  pivot_wider(names_from = VAR1,
              values_from = VAR2)
```

```r
# long to wide
patient_results_lang %>%
    pivot_wider(names_from = "zeit",
                values_from = "messung")
```

```
## # A tibble: 2 x 6
## id age condition cond_label t_1 t_2
## <dbl> <dbl> <dbl> <chr> <dbl> <dbl>
## 1 1 37 1 placebo 123 135
## 2 4 34 1 placebo 100 105
```
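
The slides do not show how `patient_results_lang` is created. One plausible reading, sketched here on a reduced copy of the placebo rows, is that it is the result of the earlier `pivot_longer()` call, which makes the two functions inverses of each other. The tibble `placebo` and the reduced column set are assumptions for illustration.

```r
library(dplyr)
library(tidyr)

# Placebo patients in wide format (reduced, illustrative copy)
placebo <- tibble(
  id  = c(1, 4),
  age = c(37, 34),
  t_1 = c(123, 100),
  t_2 = c(135, 105)
)

# Wide to long: one row per patient and measurement
patient_results_lang <- placebo %>%
  pivot_longer(cols = c("t_1", "t_2"),
               names_to = "zeit",
               values_to = "messung")

# Long to wide: back to one row per patient
round_trip <- patient_results_lang %>%
  pivot_wider(names_from = "zeit",
              values_from = "messung")

dim(patient_results_lang)
names(round_trip)
```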

---

# Aggregate

<ul>
 <li class="m1"><high>Aggregate</high>
 
 <ul class="level">
 <li><mono>summarise()</mono></li>
 <li><mono>summarise_if()</mono></li>
 <li><mono>group_by(), summarise()</mono></li>
 <li><mono>n(), first(), last(), nth()</mono></li>
 <li><mono>pull()</mono></li>
 </ul>
 </li>
</ul>

```r
patient_results
```

```
## # A tibble: 5 x 6
## id age condition cond_label t_1
## <dbl> <dbl> <dbl> <chr> <dbl>
## 1 1 37 1 placebo 123
## 2 2 65 2 medication 143
## 3 3 57 2 medication NA
## 4 4 34 1 placebo 100
## 5 5 45 2 medication NA
## # … with 1 more variable: t_2 <dbl>
```

---

# `summarise()`

```r
tibble %>%
  summarise(
    NAME1 = SUMMARY_FUN(VAR1),
    NAME2 = SUMMARY_FUN(VAR2)
  )
```

```r
patient_results %>%
  
  # descriptive stats
  summarise(
    mean_age = mean(age),
    median_t1 = median(t_1, 
                       na.rm = TRUE)
  )
```

```
## # A tibble: 1 x 2
## mean_age median_t1
## <dbl> <dbl>
## 1 47.6 123
```
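
The `na.rm = TRUE` argument in the example is doing real work. A sketch of what happens without it, using only the `t_1` values from `patient_results`:

```r
library(dplyr)

# t_1 values from patient_results, including the two missing measurements
patient_results <- tibble(t_1 = c(123, 143, NA, 100, NA))

stats <- patient_results %>%
  summarise(
    with_na    = mean(t_1),               # NA: any missing value propagates
    without_na = mean(t_1, na.rm = TRUE)  # mean of the three observed values
  )

stats
```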

---

# Grouped Aggregation

---

# `group_by()`, `summarise()`

```r
tibble %>%
  group_by(GROUP_VAR) %>%
  summarise(
    NAME1 = SUMMARY_FUN(VAR1),
    NAME2 = SUMMARY_FUN(VAR2)
  )
```

```r
patient_results %>%
  
  # Group according to cond
  group_by(cond_label) %>%
  
  # Descriptive stats
  summarise(
    mean_age = mean(age),
    median_t1 = median(t_1, 
                       na.rm = TRUE)
  )
```

```
## # A tibble: 2 x 3
## cond_label mean_age median_t1
## <chr> <dbl> <dbl>
## 1 medication 55.7 143 
## 2 placebo 35.5 112.
```

---

# `n()`

```r
tibble %>%
  group_by(GROUP_VAR) %>%
  summarise(
    NAME1 = SUMMARY_FUN(VAR1),
    NAME2 = SUMMARY_FUN(VAR2)
  )
```

```r
patient_results %>%
  
  # Group according to cond
  group_by(cond_label) %>%
  
  # Count cases per group
  summarise(
    N = n()
  )
```

```
## # A tibble: 2 x 2
## cond_label N
## <chr> <int>
## 1 medication 3
## 2 placebo 2
```
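
`n()` counts rows, including those with missing measurements, and `pull()` from the Aggregate overview turns a summary column into a plain vector. Both are sketched below on a reduced copy of `patient_results`; the column name `N_valid` is made up for this example.

```r
library(dplyr)

# Reduced copy of patient_results (illustrative subset of columns)
patient_results <- tibble(
  cond_label = c("placebo", "medication", "medication",
                 "placebo", "medication"),
  t_1        = c(123, 143, NA, 100, NA)
)

group_stats <- patient_results %>%
  group_by(cond_label) %>%
  summarise(
    N       = n(),               # rows per group
    N_valid = sum(!is.na(t_1))   # rows with a t_1 measurement
  )

# pull() extracts a single column as a plain vector
group_sizes <- group_stats %>% pull(N)
group_sizes
```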

---

<h1><a href="https://therbootcamp.github.io/R4DS_2019Feb/_sessions/Wrangling/Wrangling_practical.html">Practical</a></h1>