Session 5: Derive Information with dplyr
In this session, you will continue learning data manipulation with dplyr.
Go through the RStudio Primer on Derive Information with dplyr, and complete the assignments below.
Using the taxation dataset, complete the following tasks:
Create a new column called income_labels from income_mean with four income categories: below 50000, 50000-79999, 80000-109999, and 110000 or above. You can code the categories as numbers from 1 to 4. (Hint: use case_when() in mutate() when you make the new column.)
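As a rough illustration of the hint, here is a minimal sketch of case_when() inside mutate(). It runs on a small made-up data frame called toy, not on the taxation dataset, and it assumes the cut-offs are 50000, 80000, and 110000. The names toy and toy_labeled are hypothetical.

```r
library(dplyr)

# Hypothetical toy data; the real taxation dataset is not reproduced here.
toy <- data.frame(
  year = c(2001, 2001, 2001, 2002),
  income_mean = c(42000, 65000, 95000, 120000)
)

toy_labeled <- toy %>%
  mutate(
    # case_when() checks the conditions in order and uses the first match,
    # so each later line only needs an upper bound.
    income_labels = case_when(
      income_mean < 50000   ~ 1,
      income_mean < 80000   ~ 2,
      income_mean < 110000  ~ 3,
      income_mean >= 110000 ~ 4
    )
  )

print(toy_labeled)
```

Writing the last condition explicitly, rather than as a catch-all, means a missing income_mean stays NA instead of being placed in category 4.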
How many quarters are there for each of the four categories by year? Use group_by() and summarize() to answer. You can make a table like this:
year | income_labels | number_quarters |
---|---|---|
2001 | 1 | 7 |
2001 | 2 | 8 |
2001 | 3 | 6 |
2002 | 1 | 6 |
2002 | 2 | 9 |
2002 | 3 | 6 |
2003 | 1 | 6 |
2003 | 2 | 9 |
2003 | 3 | 6 |
2004 | 1 | 6 |
2004 | 2 | 10 |
2004 | 3 | 5 |
2005 | 1 | 6 |
2005 | 2 | 9 |
2005 | 3 | 6 |
2006 | 1 | 6 |
2006 | 2 | 9 |
2006 | 3 | 6 |
2007 | 1 | 6 |
2007 | 2 | 9 |
2007 | 3 | 5 |
2007 | 4 | 1 |
2008 | 1 | 6 |
2008 | 2 | 9 |
2008 | 3 | 5 |
2008 | 4 | 1 |
2009 | 1 | 6 |
2009 | 2 | 9 |
2009 | 3 | 5 |
2009 | 4 | 1 |
2010 | 1 | 6 |
2010 | 2 | 9 |
2010 | 3 | 5 |
2010 | 4 | 1 |
2011 | 1 | 6 |
2011 | 2 | 9 |
2011 | 3 | 6 |
2012 | 1 | 6 |
2012 | 2 | 9 |
2012 | 3 | 6 |
2013 | 1 | 4 |
2013 | 2 | 11 |
2013 | 3 | 5 |
2013 | 4 | 1 |
2014 | 1 | 4 |
2014 | 2 | 11 |
2014 | 3 | 5 |
2014 | 4 | 1 |
2015 | 1 | 3 |
2015 | 2 | 12 |
2015 | 3 | 4 |
2015 | 4 | 2 |
2016 | 1 | 2 |
2016 | 2 | 13 |
2016 | 3 | 4 |
2016 | 4 | 2 |
2017 | 1 | 2 |
2017 | 2 | 12 |
2017 | 3 | 6 |
2017 | 4 | 1 |
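A table of this shape can be produced by grouping on two columns and counting rows. The sketch below uses a small made-up data frame, not the taxation dataset. It assumes each row represents one quarter and that income_labels already exists. The names toy, quarter, and counts are hypothetical.

```r
library(dplyr)

# Hypothetical toy data: one row per quarter per year (an assumption).
toy <- data.frame(
  year = c(2001, 2001, 2001, 2002, 2002),
  quarter = c("A", "B", "C", "A", "B"),
  income_labels = c(1, 1, 2, 1, 3)
)

counts <- toy %>%
  group_by(year, income_labels) %>%
  # n() counts the rows in each year/category group;
  # .groups = "drop" (dplyr 1.0.0 or later) returns an ungrouped result.
  summarize(number_quarters = n(), .groups = "drop")

print(counts)
```

If the real data had several rows per quarter within a year, n_distinct() on the quarter column would be the safer way to count than n().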