Data Science Questions and Answers Part-29

1. Point out the wrong statement.
a) Additive response models don’t make much sense if the response is discrete, or strictly positive
b) Transformations are often easy to interpret in a linear model
c) Regression models are used to predict one variable from one or more other variables
d) All of the mentioned

Answer: b
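Explanation: Transformations are often hard to interpret in a linear model, because the coefficients then describe the transformed scale rather than the original units of the response.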

2. Which of the following components is involved in generalized linear models?
a) An exponential family model for the response
b) A systematic component via a linear predictor
c) A link function that connects the means of the response to the linear predictor
d) All of the mentioned

Answer: d
Explanation: A GLM involves all three components. It is a flexible generalization of ordinary linear regression that allows the response variable to have an error distribution other than the normal distribution.
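
Illustration: the three components listed in the options can be sketched in plain Python for a Poisson response with a log link. This is only a sketch; the coefficient and covariate values are made up for illustration and are not fitted estimates.

```python
import math

def linear_predictor(beta, x):
    """Systematic component: eta = beta0 + beta1*x1 + beta2*x2 + ..."""
    return beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))

def inverse_log_link(eta):
    """Link function g(mu) = log(mu), so the mean is mu = exp(eta)."""
    return math.exp(eta)

def poisson_log_pmf(y, mu):
    """Exponential family model for the response: log P(Y = y) under Poisson(mu)."""
    return y * math.log(mu) - mu - math.lgamma(y + 1)

beta = [0.5, 0.2]   # hypothetical intercept and slope
x = [3.0]           # hypothetical covariate value

eta = linear_predictor(beta, x)    # linear predictor
mu = inverse_log_link(eta)         # mean of the response
log_lik = poisson_log_pmf(2, mu)   # log-likelihood of observing y = 2
```

Swapping the link function and the response distribution (for example, a logit link with a Bernoulli response) gives other members of the GLM family while the linear predictor stays the same.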

3. A collection of exchangeable binary outcomes for the same covariate data is called _______ outcomes.
a) random
b) direct
c) binomial
d) none of the mentioned

Answer: c
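Explanation: Exchangeable binary outcomes that share the same covariate values can be summed into a single binomial count. The usual regression model for binary outcomes gives odds ratios, not risk ratios.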
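
Illustration: a minimal sketch of how binary outcomes recorded under the same covariate values collapse into one binomial count. The outcome list and the success probability of 0.5 are hypothetical.

```python
import math

def binomial_pmf(k, n, p):
    """P(K = k) for K ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

# Hypothetical binary outcomes observed at identical covariate values.
outcomes = [1, 0, 1, 1, 0]

n = len(outcomes)          # number of trials
successes = sum(outcomes)  # exchangeability means only the count matters

prob = binomial_pmf(successes, n, 0.5)
```

Because the outcomes are exchangeable, their order carries no information, so the pair (successes, n) summarizes the whole group.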

4. Point out the wrong statement.
a) Asymptotics are used for inference usually
b) Adding squared terms makes the fitted function continuously differentiable at the knot points
c) Adding squared terms makes the fitted function twice continuously differentiable at the knot points
d) None of the mentioned

Answer: c
Explanation: Squared terms give only one continuous derivative at the knot points. Adding cubic terms makes the fitted function twice continuously differentiable there.
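
Illustration: a small numerical check of the smoothness claim, assuming the knot terms have the truncated power form (x - knot)_+^d. The knot location, offset, and step size are arbitrary choices for this sketch.

```python
def truncated_power(x, knot, degree):
    """(x - knot)_+^degree: zero left of the knot, a power term right of it."""
    return max(x - knot, 0.0) ** degree

def second_difference(f, x, h):
    """Central finite-difference estimate of the second derivative f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)

knot, offset, h = 0.0, 0.01, 1e-3

def squared_term(x):
    return truncated_power(x, knot, 2)

def cubic_term(x):
    return truncated_power(x, knot, 3)

# Second derivative just left and just right of the knot.
sq_left = second_difference(squared_term, knot - offset, h)
sq_right = second_difference(squared_term, knot + offset, h)
cu_left = second_difference(cubic_term, knot - offset, h)
cu_right = second_difference(cubic_term, knot + offset, h)

sq_jump = sq_right - sq_left   # stays near 2 however small the offset is
cu_jump = cu_right - cu_left   # equals 6 * offset, so it vanishes at the knot
```

The squared term has a second derivative that jumps from 0 to 2 across the knot, while the cubic term's second derivative is 6(x - knot)_+ and therefore continuous.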

5. Which of the following is an example use of the Poisson distribution?
a) Analyzing contingency table data
b) Modeling web traffic hits
c) Incidence rates
d) All of the mentioned

Answer: d
Explanation: The Poisson distribution is a useful model for counts and rates.
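
Illustration: a minimal sketch of the Poisson distribution used for a count over a monitoring window. The rate of 4 hits per hour and the half-hour window are hypothetical numbers.

```python
import math

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

rate_per_hour = 4.0   # hypothetical web traffic rate
hours = 0.5           # hypothetical monitoring time

# Expected count = rate * monitoring time.
lam = rate_per_hour * hours

p_zero = poisson_pmf(0, lam)                               # no hits at all
p_at_most_2 = sum(poisson_pmf(k, lam) for k in range(3))   # 0, 1, or 2 hits
```

Scaling the rate by the monitoring time is what lets the same distribution describe both raw counts and incidence rates.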

6. How many outcomes are possible with a Bernoulli trial?
a) 2
b) 3
c) 4
d) None of the mentioned

Answer: a
Explanation: A Bernoulli trial is a random experiment with exactly two possible outcomes.

7. Which of the following analyses is a statistical process for estimating the relationships among variables?
a) Causal
b) Regression
c) Multivariate
d) All of the mentioned

Answer: b
Explanation: Regression analysis estimates how a response variable relates to one or more other variables. Regression models provide the scientist with a powerful tool, allowing predictions about past, present, or future events to be made from information about past or present events.
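
Illustration: a minimal least-squares sketch of estimating a relationship and then using it for prediction. The data points are made up and lie exactly on the line y = 1 + 2x.

```python
def fit_simple_regression(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical observations.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

intercept, slope = fit_simple_regression(xs, ys)

# Predict the response at a new predictor value.
prediction = intercept + slope * 5.0
```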

8. Which of the following can be used to generate balanced cross-validation groupings from a set of data?
a) createFolds
b) createSample
c) createResample
d) none of the mentioned

Answer: a
Explanation: createFolds, from R's caret package, generates balanced cross-validation groupings. createResample, by contrast, can be used to make simple bootstrap samples.
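
Illustration: the idea of balanced cross-validation groupings can be sketched in Python. The function `make_balanced_folds` below is a hypothetical stand-in written for this example; it is not the caret implementation and makes no claim to match its output.

```python
import random
from collections import defaultdict

def make_balanced_folds(labels, k, seed=0):
    """Split indices into k folds with class proportions kept roughly equal."""
    rng = random.Random(seed)

    # Group the row indices by class label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    # Shuffle within each class, then deal indices to folds round-robin.
    folds = [[] for _ in range(k)]
    position = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for index in indices:
            folds[position % k].append(index)
            position += 1
    return [sorted(fold) for fold in folds]

# Hypothetical class labels: six "a" rows and three "b" rows.
labels = ["a"] * 6 + ["b"] * 3
folds = make_balanced_folds(labels, k=3)
```

Each fold serves once as the held-out set while the remaining folds are used for training.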

9. Point out the wrong statement.
a) Simple random sampling of time series is probably the best way to resample time series data.
b) Three parameters are used for time series splitting
c) The horizon parameter is the number of consecutive values in the test set sample
d) All of the mentioned

Answer: a
Explanation: Simple random sampling of time series is probably not the best way to resample time series data, because it ignores the temporal ordering of the observations.

10. Which of the following functions can create the indices for time series type of splitting?
a) newTimeSlices
b) createTimeSlices
c) binTimeSlices
d) none of the mentioned

Answer: b
Explanation: createTimeSlices creates the indices for this type of splitting. Rolling forecasting origin techniques are associated with time series type of splitting.
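
Illustration: a rolling forecasting origin split can be sketched in Python. The function `make_time_slices` and its parameter names (`initial_window`, `horizon`, `fixed_window`) are hypothetical stand-ins chosen for this example, and the indices are 0-based; this is not the caret implementation.

```python
def make_time_slices(n, initial_window, horizon=1, fixed_window=True):
    """Return (train, test) index lists for a rolling forecasting origin.

    n              -- number of observations, in time order
    initial_window -- number of consecutive values in the first training set
    horizon        -- number of consecutive values in each test set
    fixed_window   -- if True the training window slides; if False it grows
    """
    slices = []
    last_train_end = n - horizon
    for train_end in range(initial_window, last_train_end + 1):
        train_start = train_end - initial_window if fixed_window else 0
        train = list(range(train_start, train_end))
        test = list(range(train_end, train_end + horizon))
        slices.append((train, test))
    return slices

# Six observations, train on three, test on the next two.
sliding = make_time_slices(n=6, initial_window=3, horizon=2)
growing = make_time_slices(n=6, initial_window=3, horizon=2, fixed_window=False)
```

Every test set lies strictly after its training set, which is the property that simple random sampling fails to preserve.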