1. _________ generates summary statistics of different variables in the data frame, possibly within strata.
a) rename
b) summarize
c) set
d) subset
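Explanation: summarize generates summary statistics of variables in a data frame, optionally within groups (strata). The dplyr package also has a number of its own data types that it takes advantage of.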
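As a minimal sketch of summary statistics within strata, the example below groups the built-in mtcars data set by cylinder count and summarizes each group. The data set and the column names mean_mpg and n_cars are chosen only for illustration.

```r
library(dplyr)

# Group the built-in mtcars data by number of cylinders (the strata),
# then compute one row of summary statistics per group.
by_cyl <- summarize(
  group_by(mtcars, cyl),
  mean_mpg = mean(mpg),  # average miles per gallon in each group
  n_cars   = n()         # number of rows in each group
)

print(by_cyl)
```

The result has one row per distinct value of cyl rather than one row per car.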
2. Point out the wrong statement.
a) The dplyr package was developed by Hadley Wickham of RStudio
b) The dplyr package is an optimized and distilled version of his plyr package
c) The dplyr package provides any “new” functionality to R
d) The dplyr package does not provide any “new” functionality to R
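Explanation: The dplyr package does not provide any “new” functionality to R; everything it does could already be done with base R, but it greatly simplifies that existing functionality.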
3. ________ adds new variables/columns or transforms existing variables.
a) mutate
b) add
c) append
d) arrange
Explanation: mutate adds new variables or transforms existing ones. By contrast, arrange is used to reorder the rows of a data frame.
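A short sketch of both uses of mutate(), again on the built-in mtcars data set. The new column name kpl and the conversion factors are illustrative assumptions, not part of the quiz.

```r
library(dplyr)

# Add a new column: fuel economy in kilometres per litre
# (1 mile per US gallon is roughly 0.425144 km per litre).
cars2 <- mutate(mtcars, kpl = mpg * 0.425144)

# Transform an existing column: wt is stored in units of 1000 lbs,
# so convert it to plain pounds in place.
cars2 <- mutate(cars2, wt = wt * 1000)

head(cars2[, c("mpg", "kpl", "wt")])
```

The original mtcars object is left unchanged; mutate() returns a modified copy.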
4. The _______ operator is used to connect multiple verb actions together into a pipeline.
a) pipe
b) piper
c) start
d) end
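Explanation: The pipe operator is written %>%. It passes the result of the expression on its left as the first argument of the function on its right.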
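The sketch below chains three verbs with %>% on the built-in mtcars data set. The particular filter condition and columns are chosen only for illustration.

```r
library(dplyr)

# Each verb receives the data frame produced by the previous step.
result <- mtcars %>%
  filter(cyl == 4) %>%     # keep only four-cylinder cars
  select(mpg, hp) %>%      # keep two columns
  arrange(desc(mpg))       # sort by mpg, highest first

print(result)
```

Without the pipe, the same pipeline would be written inside out as arrange(select(filter(mtcars, cyl == 4), mpg, hp), desc(mpg)), which is harder to read.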
5. The dplyr package can be installed from GitHub using the _______ package.
a) dev
b) devtools
c) devtool
d) devdel
Explanation: The devtools package can install dplyr directly from GitHub. The GitHub repository will usually contain the latest updates to the package and the development version.
6. The dplyr package can be installed from CRAN using __________
a) installall.packages("dplyr")
b) install.packages("dplyr")
c) installed.packages("dplyr")
d) installed.packages("dpl")
Explanation: After installing the package it is important that you load it into your R session with the library() function.
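A minimal sketch of the install-then-load sequence. The requireNamespace() guard is an addition for convenience so that the package is not reinstalled on every run; the commented-out line shows one way to get the development version, assuming the repository lives at tidyverse/dplyr on GitHub.

```r
# Install dplyr from CRAN only if it is not already available.
if (!requireNamespace("dplyr", quietly = TRUE)) {
  install.packages("dplyr")
}

# Development version from GitHub instead (requires the devtools package):
# devtools::install_github("tidyverse/dplyr")

# Installing is not enough: attach the package to the current session.
library(dplyr)
```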
7. Which of the following objects is masked from ‘package:stats’?
a) filter
b) union
c) set difference
d) get difference
Explanation: Loading dplyr masks filter (and lag) from ‘package:stats’. The following objects are masked from ‘package:base’: intersect, setdiff, setequal, union.
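Masking means the bare name now resolves to the dplyr version, while the original stays reachable through its namespace. The sketch below illustrates this; the variable names masked_by and smoothed are hypothetical.

```r
library(dplyr)

# After loading dplyr, a bare filter() resolves to the dplyr function.
masked_by <- environmentName(environment(filter))
print(masked_by)

# The masked stats version is still available with an explicit prefix.
# Here it computes a centred 3-point moving average of 1:5.
smoothed <- stats::filter(1:5, rep(1 / 3, 3))
print(smoothed)
```

Writing stats::filter() or dplyr::filter() explicitly avoids any ambiguity about which function is called.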
8. The _________ function can be used to select columns of a data frame that you want to focus on.
a) select
b) rename
c) get
d) set
Explanation: The select() function allows you to get the few columns you might need.
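For instance, a minimal sketch that keeps three columns of the built-in mtcars data set (the choice of columns is arbitrary):

```r
library(dplyr)

# Keep only the named columns, in the order given.
subset_cols <- select(mtcars, mpg, cyl, hp)

head(subset_cols)
```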
9. Point out the correct statement.
a) You can also omit variables using the select() function by using the negative sign
b) The arrange() function also allows a special syntax that allows you to specify variable names based on patterns
c) Reordering rows of a data frame is normally easier to do in R
d) The dplyr package provides any “new” functionality to R
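Explanation: Prefixing a variable with a negative sign in select() omits it, and it is select(), not arrange(), that supports pattern-based variable names. The arrange() function is used to reorder rows of a data frame according to one of the variables/columns.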
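The sketch below shows the three behaviours side by side on the built-in mtcars data set. starts_with() is one of the pattern helpers available inside select(); the variable names are illustrative.

```r
library(dplyr)

# Omit a column with the negative sign.
no_mpg <- select(mtcars, -mpg)

# Pick columns by a name pattern: here, names beginning with "d".
d_cols <- select(mtcars, starts_with("d"))

# Reorder rows by a column, ascending and then descending.
by_mpg      <- arrange(mtcars, mpg)
by_mpg_desc <- arrange(mtcars, desc(mpg))

names(d_cols)
```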
10. The ________ function is similar to the existing subset() function in R but is quite a bit faster.
a) rename
b) filter
c) set
d) subset
Explanation: The filter() function is used to extract subsets of rows from a data frame.
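A small sketch comparing the two on the built-in mtcars data set. It only checks that both return the same rows for this condition; it does not measure speed.

```r
library(dplyr)

# dplyr: conditions separated by commas are combined with AND.
with_filter <- filter(mtcars, mpg > 25, cyl == 4)

# Base R equivalent using subset().
with_subset <- subset(mtcars, mpg > 25 & cyl == 4)

nrow(with_filter)
nrow(with_subset)
```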