Work with Data: Derive Information with dplyr

Session 5: Derive Information with dplyr

5.1 Derive Information with dplyr

In this session, you will continue learning data manipulation with dplyr.

Go through the RStudio Primer "Derive Information with dplyr", then complete the assignment below.

 

Assignment

Using the taxation dataset, complete the following tasks:

  1. Create a new column called income_labels from income_mean with four income categories: below 50000, 50000-79999, 80000-109999, and 110000 or above. You can code the categories as the numbers 1 to 4. (Hint: use case_when() inside mutate() to create the new column.)

  2. How many quarters are there for each of the four categories by year? Use group_by() and summarize() to answer. You can make a table like this:

year income_labels number_quarters
2001 1 7
2001 2 8
2001 3 6
2002 1 6
2002 2 9
2002 3 6
2003 1 6
2003 2 9
2003 3 6
2004 1 6
2004 2 10
2004 3 5
2005 1 6
2005 2 9
2005 3 6
2006 1 6
2006 2 9
2006 3 6
2007 1 6
2007 2 9
2007 3 5
2007 4 1
2008 1 6
2008 2 9
2008 3 5
2008 4 1
2009 1 6
2009 2 9
2009 3 5
2009 4 1
2010 1 6
2010 2 9
2010 3 5
2010 4 1
2011 1 6
2011 2 9
2011 3 6
2012 1 6
2012 2 9
2012 3 6
2013 1 4
2013 2 11
2013 3 5
2013 4 1
2014 1 4
2014 2 11
2014 3 5
2014 4 1
2015 1 3
2015 2 12
2015 3 4
2015 4 2
2016 1 2
2016 2 13
2016 3 4
2016 4 2
2017 1 2
2017 2 12
2017 3 6
2017 4 1
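
The two tasks follow a common dplyr pattern: label rows with case_when() inside mutate(), then count rows per group with group_by() and summarize(). The sketch below shows that pattern on a small made-up data frame, not on the taxation dataset. The object names toy, labelled, and counts are hypothetical, the income values are invented for illustration, and it assumes each row is one unit to be counted.

```r
library(dplyr)

# Hypothetical toy data: column names mirror the assignment, values are made up.
toy <- tibble(
  year        = c(2001, 2001, 2001, 2002, 2002, 2002),
  income_mean = c(42000, 65000, 120000, 49000, 95000, 81000)
)

# case_when() checks the conditions in order and uses the first one that is TRUE,
# so each later condition only needs an upper bound.
# Rows with a missing income_mean match no condition and stay NA.
labelled <- toy %>%
  mutate(
    income_labels = case_when(
      income_mean < 50000   ~ 1,
      income_mean < 80000   ~ 2,
      income_mean < 110000  ~ 3,
      income_mean >= 110000 ~ 4
    )
  )

# Count the rows in each year/category combination.
# n() returns the number of rows in the current group;
# .groups = "drop" returns an ungrouped result.
counts <- labelled %>%
  group_by(year, income_labels) %>%
  summarize(number_quarters = n(), .groups = "drop")

print(counts)
```

Only combinations that actually occur in the data get a row in the result, which is why a category can be absent for some years, as in the example table above.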