Data Science Questions and Answers Part-9

1. Which of the following is used to compute the percent change over a given number of periods?
a) pct_change
b) percent_change
c) per_change
d) none of the mentioned

Answer: a
Explanation: Series and DataFrame both have a pct_change method, which takes a periods argument (default 1). Panel also had it, but Panel was removed in pandas 0.25.
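
A minimal sketch of pct_change on a Series, assuming a recent pandas release; the price values are made up for illustration:

```python
import pandas as pd

# Hypothetical price series, invented for this example.
prices = pd.Series([100.0, 110.0, 99.0, 118.8])

# Percent change relative to the previous row (periods=1 is the default).
one_step = prices.pct_change()

# Percent change relative to the value two rows earlier.
two_step = prices.pct_change(periods=2)

print(one_step)
print(two_step)
```

The first `periods` entries of each result are NaN because there is no earlier value to compare against.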

2. Point out the correct statement.
a) Pandas represents timestamps in microsecond resolution
b) Pandas is 100% thread safe
c) For Series and DataFrame objects, var normalizes by N-1 to produce unbiased estimates
d) all of the mentioned

Answer: c
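Explanation: For Series and DataFrame, var divides by N-1 by default (ddof=1), which gives the unbiased sample variance. Statement a is wrong because pandas represents timestamps in nanosecond resolution by default, and statement b is wrong because pandas is not fully thread safe.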
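
A short sketch, with made-up data, of the N-1 normalization in var and of a Timestamp carrying a nanosecond component. The comparison with NumPy is added here for illustration only:

```python
import numpy as np
import pandas as pd

s = pd.Series([2.0, 4.0, 4.0, 6.0])

sample_var = s.var()              # pandas default ddof=1: divides by N-1
population_var = s.var(ddof=0)    # divides by N
numpy_var = np.var(s.to_numpy())  # NumPy's default is ddof=0

# A timestamp with a single nanosecond after midnight.
ts = pd.Timestamp("2024-01-01 00:00:00.000000001")

print(sample_var, population_var, numpy_var, ts.nanosecond)
```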

3. Which of the following objects has a cov method to compute the covariance between two Series?
a) Series
b) DataFrame
c) Panel
d) none of the mentioned

Answer: a
Explanation: Series.cov computes the covariance between two Series, excluding NA/null values. DataFrame has an analogous cov method that computes pairwise covariances among the series in the DataFrame, also excluding NA/null values.
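
The two cov methods can be compared in a small sketch; the data and the column names x and y are assumptions made for this example:

```python
import numpy as np
import pandas as pd

x = pd.Series([1.0, 2.0, 3.0, 4.0])
y = pd.Series([2.0, 4.0, 6.0, np.nan])

# Series.cov: covariance between two Series; the row with NaN is dropped.
pair_cov = x.cov(y)

# DataFrame.cov: pairwise covariance matrix over the columns.
frame = pd.DataFrame({"x": x, "y": y})
cov_matrix = frame.cov()

print(pair_cov)
print(cov_matrix)
```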

4. Which of the following specifies the required minimum number of observations for each column pair in order to have a valid result?
a) min_periods
b) max_periods
c) minimum_periods
d) all of the mentioned

Answer: a
Explanation: DataFrame.cov supports an optional min_periods keyword. A column pair with fewer valid observations than min_periods yields NaN.
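
A minimal sketch of min_periods in DataFrame.cov, using an invented frame in which columns a and b share only two complete rows:

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [1.0, np.nan, np.nan, 5.0],
})

# Default: any pair with enough data to compute a covariance gets a value.
loose = frame.cov()

# Require at least three complete observations per column pair.
strict = frame.cov(min_periods=3)

print(loose)
print(strict)
```

With min_periods=3, the (a, b) entry becomes NaN because only two rows have both values present.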

5. Point out the wrong statement.
a) lxml is very fast
b) lxml requires Cython to install correctly
c) lxml does not make any guarantees about the results of its parse
d) None of the mentioned

Answer: c
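Explanation: The pandas documentation notes that lxml makes no guarantees about its parse results unless it is given strictly valid markup, so the unqualified statement in c is the wrong one. More generally, there are some versioning issues surrounding the libraries that are used to parse HTML tables in the top-level pandas io function read_html.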

6. Which of the following is implemented on DataFrame to compute the correlation between like-labeled Series contained in different DataFrame objects?
a) corrwith
b) corwith
c) corwit
d) none of the mentioned

Answer: a
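Explanation: DataFrame.corrwith computes the correlation between like-labeled rows or columns of two objects, for example between each column of one DataFrame and the column with the same label in another.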
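
A small sketch of corrwith on two frames that share column labels; the frames left and right are invented for illustration:

```python
import pandas as pd

left = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0], "b": [4.0, 3.0, 2.0, 1.0]})
right = pd.DataFrame({"a": [2.0, 4.0, 6.0, 8.0], "b": [1.0, 2.0, 3.0, 4.0]})

# Correlate left["a"] with right["a"] and left["b"] with right["b"].
result = left.corrwith(right)

print(result)
```

Column a moves in the same direction in both frames and column b in opposite directions, so the two correlations have opposite signs.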

7. The rolling_count function gives the number of non-null observations.
a) True
b) False

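Answer: a
Explanation: rolling_count returns the number of non-null observations in each window. In current pandas the same computation is written as rolling(window).count() on a Series or DataFrame.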
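
In current pandas a rolling count is written as rolling(window).count(). A minimal sketch, with made-up data, of how it counts the non-null observations in each window:

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, np.nan, 3.0, 4.0, np.nan])

# Count non-null values in a trailing window of three rows.
# min_periods=1 lets the first, shorter windows produce a result too.
counts = s.rolling(window=3, min_periods=1).count()

print(counts)
```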

8. Which of the following methods produces a data ranking with ties being assigned the mean of the ranks for the group?
a) rank
b) dense_rank
c) partition_rank
d) none of the mentioned

Answer: a
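Explanation: With its default method="average", rank assigns tied values the mean of the ranks they span. rank is available on both Series and DataFrame.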
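
A brief sketch of how rank treats ties, using an invented score series; the "dense" and "min" methods are shown only for contrast with the default:

```python
import pandas as pd

scores = pd.Series([10, 20, 20, 30])

average = scores.rank()               # default method="average"
dense = scores.rank(method="dense")   # ties share a rank, no gaps afterwards
lowest = scores.rank(method="min")    # ties take the lowest rank in the group

print(average.tolist())
print(dense.tolist())
print(lowest.tolist())
```

The two tied values occupy ranks 2 and 3, so the default method gives each of them 2.5.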

9. Which of the following can potentially change the dtype of a series?
a) reindex_like
b) index_like
c) itime_like
d) none of the mentioned

Answer: a
Explanation: reindex_like silently inserts NaNs for labels that are missing from the original, and the dtype changes accordingly (for example, an integer Series becomes float64).
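
A minimal sketch of the dtype change; the Series names ints and template are hypothetical and chosen for this example:

```python
import pandas as pd

ints = pd.Series([1, 2, 3], index=["a", "b", "c"])

# A second Series whose index has one label ("d") that ints lacks.
template = pd.Series([0.0, 0.0, 0.0, 0.0], index=["a", "b", "c", "d"])

# Align ints to the index of template; "d" is filled with NaN.
aligned = ints.reindex_like(template)

print(ints.dtype)
print(aligned.dtype)
print(aligned)
```

Because NaN cannot be stored in a NumPy integer array, the aligned result is upcast to float64.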

10. cov and corr support the optional min_periods keyword.
a) True
b) False

Answer: a
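Explanation: Both DataFrame.cov and DataFrame.corr accept an optional min_periods keyword. Older pandas versions automatically excluded non-numeric columns from the correlation calculation; since pandas 2.0 this requires passing numeric_only=True.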
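
A short sketch of corr with min_periods, assuming pandas 1.5 or later so that the numeric_only keyword is available; the frame and its label column are invented for illustration:

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [2.0, 4.0, np.nan, 8.0],
    "label": ["w", "x", "y", "z"],  # non-numeric column
})

# numeric_only=True drops the string column before correlating.
loose = frame.corr(numeric_only=True, min_periods=3)
strict = frame.corr(numeric_only=True, min_periods=4)

print(loose)
print(strict)
```

Columns a and b share three complete rows, so the pair is reported with min_periods=3 and becomes NaN with min_periods=4.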