Quantitative Investment Analysis - Chapter 8
library(tidyverse)
library(knitr)
library(kableExtra)
In this post we solve the end-of-chapter practice problems from chapter 7 of the book.
Problem 1
Julie Moon is an energy analyst examining electricity, oil, and natural gas consumption in different regions over different seasons. She ran a regression explaining the variation in energy consumption as a function of temperature. The total variation of the dependent variable was 140.58, the explained variation was 60.16, and the unexplained variation was 80.42. She had 60 monthly observations.
- A Compute the coefficient of determination.
- B What was the sample correlation between energy consumption and temperature?
- C Compute the standard error of the estimate of Moon’s regression model.
- D Compute the sample standard deviation of monthly energy consumption.
# total variation
t_v <- 140.58
# explained variation
e_v <- 60.16
# unexplained variation
ue_v <- 80.42
# Coef of determination
coef_d <- e_v / t_v
cat("The coefficient of determination is", coef_d)
## The coefficient of determination is 0.4279414
A - The coefficient of determination is 0.4279414
corr <- sqrt(coef_d)
cat("The sample correlation between energy consumption and temperature is", corr)
## The sample correlation between energy consumption and temperature is 0.6541723
B - The sample correlation between energy consumption and temperature is 0.6541723
# The standard error of the estimate is the square root of the
# unexplained variation divided by its degrees of freedom (n - 2)
# number of observations
n <- 60
# se
se <- sqrt(ue_v / (n - 2))
cat("The standard error of the estimate of Moon’s regression model is", se)
## The standard error of the estimate of Moon’s regression model is 1.177519
C - The standard error of the estimate of Moon’s regression model is 1.177519
# The sample variance of the dependent variable is the total variation divided by n - 1
sv <- t_v / (n - 1)
# sample standard deviation is
s <- sqrt(sv)
cat("The sample standard deviation of monthly energy consumption is", s)
## The sample standard deviation of monthly energy consumption is 1.543604
D - The sample standard deviation of monthly energy consumption is 1.543604
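As a cross-check on parts A and C, the same quantities can be reached a second way. This is only a sketch that reuses the figures given in the problem; the variable names are my own.

```r
# Figures given in Problem 1
total_var       <- 140.58
explained_var   <- 60.16
unexplained_var <- 80.42
n_obs           <- 60

# The explained and unexplained parts should add up to the total variation
stopifnot(isTRUE(all.equal(explained_var + unexplained_var, total_var)))

# R-squared can also be written as 1 - unexplained / total
r2_alt <- 1 - unexplained_var / total_var

# The SEE is the square root of the mean squared error (n - 2 degrees of freedom)
see <- sqrt(unexplained_var / (n_obs - 2))

round(c(r2 = r2_alt, see = see), 4)
```

Both values should agree with the answers to A and C above.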
Problem 4
What is the value of the coefficient of determination?
- A 0.8261.
- B 0.7436.
- C 0.8623.
B - 0.7436
Problem 5
Suppose that you deleted several of the observations that had small residual values. If you re-estimated the regression equation using this reduced sample, what would likely happen to the standard error of the estimate and the R-squared?
C - Deleting observations with small residual values will increase standard error and decrease R-Squared
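A small simulation can make this answer concrete. The data below are made up (a linear relationship plus noise, with an arbitrary seed) and are not from the book; the sketch only illustrates the direction of the effect.

```r
set.seed(42)  # arbitrary seed, for reproducibility only

# Simulated data (an assumption, not from the book): y = 1 + 2x + noise
n <- 100
x <- rnorm(n)
y <- 1 + 2 * x + rnorm(n, sd = 2)

full_fit <- lm(y ~ x)

# Drop the 30 observations with the smallest absolute residuals
keep <- order(abs(residuals(full_fit)), decreasing = TRUE)[1:(n - 30)]
reduced_fit <- lm(y[keep] ~ x[keep])

# Compare the standard error of the estimate (sigma) and R-squared
results <- data.frame(
  sample    = c("full", "reduced"),
  see       = c(summary(full_fit)$sigma, summary(reduced_fit)$sigma),
  r_squared = c(summary(full_fit)$r.squared, summary(reduced_fit)$r.squared)
)
results
```

Removing the points closest to the line leaves the unexplained variation almost unchanged while reducing the degrees of freedom and the explained variation, so the standard error rises and R-squared falls.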
Problem 6
What is the correlation between X and Y?
- A −0.7436.
- B 0.7436.
- C 0.8623.
# Coef of Determination aka R-squared
coef_d <- 0.7436
corr <- sqrt(coef_d)
corr
## [1] 0.8623224
C - 0.8623
Problem 7
Where did the F-value in the ANOVA table come from?
- A You look up the F-value in a table. The F depends on the numerator and denominator degrees of freedom.
- B Divide the “Mean Square” for the regression by the “Mean Square” of the residuals.
- C The F-value is equal to the reciprocal of the t-value for the slope coefficient.
B - Divide the “Mean Square” for the regression by the “Mean Square” of the residuals.
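To see where the F-value comes from, here is a sketch on simulated data (the data and seed are placeholders, not the exhibit from the book). It rebuilds F from the two mean squares and also shows that, with one independent variable, F equals the square of the slope's t-value rather than its reciprocal.

```r
set.seed(1)  # arbitrary seed

# Simulated data (an assumption for illustration only)
x <- rnorm(50)
y <- 0.5 + 1.5 * x + rnorm(50)

fit     <- lm(y ~ x)
aov_tab <- anova(fit)

# Mean squares: row 1 is the regression (x), row 2 is the residuals
msr <- aov_tab[["Mean Sq"]][1]
mse <- aov_tab[["Mean Sq"]][2]

f_manual   <- msr / mse
f_reported <- aov_tab[["F value"]][1]
t_slope    <- summary(fit)$coefficients["x", "t value"]

c(f_manual = f_manual, f_reported = f_reported, t_squared = t_slope^2)
```

All three numbers should be identical up to floating-point error.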
Problem 8
If the ratio of net income to sales for a restaurant is 5 percent, what is the predicted ratio of cash flow from operations to sales?
- A 0.007 + 0.103(5.0) = 0.524.
- B 0.077 − 0.826(5.0) = −4.054.
- C 0.077 + 0.826(5.0) = 4.207.
C - 0.077 + 0.826(5.0) = 4.207.
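The arithmetic in answer C can be checked in R. The coefficients are the ones quoted in the answer; the variable names are mine.

```r
# Estimated coefficients quoted in answer C
intercept <- 0.077
slope     <- 0.826

# Ratio of net income to sales, in percent
ni_to_sales <- 5.0

# Predicted ratio of cash flow from operations to sales
predicted_cfo_to_sales <- intercept + slope * ni_to_sales
predicted_cfo_to_sales
```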
Problem 9
Is the relationship between the ratio of cash flow to operations and the ratio of net income to sales significant at the 5 percent level?
- A No, because the R-squared is greater than 0.05.
- B No, because the p-values of the intercept and slope are less than 0.05.
- C Yes, because the p-values for F and t for the slope coefficient are less than 0.05.
C - Yes, because the p-values for F and t for the slope coefficient are less than 0.05.
Problem 10
Did Batten’s regression analyze cross-sectional or time-series data, and what was the expected value of the error term from that regression?
A - The data are time-series and the expected value of the error term is 0
Problem 11
Based on the regression, which used data in decimal form, if the CPIENG decreases by 1.0 percent, what is the expected return on Stellar common stock during the next period?
- A 0.0073 (0.73 percent).
- B 0.0138 (1.38 percent).
- C 0.0203 (2.03 percent).
# beta
b <- -0.6486
# alpha
a <- 0.0138
exp_ret <- a + b * (-0.01)
exp_ret
## [1] 0.020286
C - 0.0203
Problem 12
Based on Batten’s regression model, the coefficient of determination indicates that:
- A Stellar’s returns explain 2.11 percent of the variability in CPIENG.
- B Stellar’s returns explain 14.52 percent of the variability in CPIENG.
- C Changes in CPIENG explain 2.11 percent of the variability in Stellar’s returns.
C - Changes in CPIENG explain 2.11 percent of the variability in Stellar’s returns.
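A quick calculation suggests where the distractor in option B may come from. It uses the R-squared implied by answer C and the slope from Problem 11; reading 14.52 percent as the absolute correlation is my interpretation, not something the book states here.

```r
r2    <- 0.0211   # coefficient of determination (2.11 percent, answer C)
slope <- -0.6486  # slope estimate used in Problem 11

# In a one-variable regression, |correlation| = sqrt(R-squared),
# and the correlation takes the sign of the slope
corr <- sign(slope) * sqrt(r2)
round(corr, 4)
```

The magnitude is roughly 0.145, so option B seems to confuse the correlation with the coefficient of determination and also reverses the dependent and independent variables.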
Problem 13
For Batten’s regression model, the standard error of the estimate shows that the standard deviation of:
- A the residuals from the regression is 0.0710.
- B values estimated from the regression is 0.0710.
- C Stellar’s observed common stock returns is 0.0710.
A - The residuals from the regression is 0.0710.
Problem 14
For the analysis run by Batten, which of the following is an incorrect conclusion from the regression output?
- A The estimated intercept coefficient from Batten’s regression is statistically significant at the 0.05 level.
- B In the month after the CPIENG declines, Stellar’s common stock is expected to exhibit a positive return.
- C Viewed in combination, the slope and intercept coefficients from Batten’s regression are not statistically significant at the 0.05 level.
C - Viewed in combination, the slope and intercept coefficients from Batten’s regression are not statistically significant at the 0.05 level.
Problem 15
Based on Exhibits 1 and 2, if Liu were to graph the 50 observations, the scatterplot summarizing this relation would be best described as:
- A horizontal.
- B upward sloping.
- C downward sloping.
C - downward sloping.
Problem 16
Based on Exhibit 1, the sample covariance is closest to:
- A −9.2430.
- B −0.1886.
- C 8.4123.
# The sample covariance is the sum of cross products
# of deviations from the mean
# divided by n - 1
sum_cp <- -9.2430
n <- 50
sum_cp / (n - 1)
## [1] -0.1886327
B - −0.1886
Problem 17
Based on Exhibit 1, the correlation between the debt ratio and the short interest ratio is closest to:
- A −0.3054.
- B 0.0933.
- C 0.3054.
# R-squared
r2 <- 0.0933
# corr
corr <- sqrt(r2)
# Since the coefficient sign is negative
# our correlation is also negative
-corr
## [1] -0.3054505
A - -0.3054
Problem 18
Which of the interpretations best describes Liu’s findings for her report?
- A Interpretation 1
- B Interpretation 2
- C Interpretation 3
C - Interpretation 3
Problem 19
The dependent variable in Liu’s regression analysis is the:
- A intercept.
- B debt ratio.
- C short interest ratio.
C - short interest ratio.
Problem 20
Based on Exhibit 2, the degrees of freedom for the t-test of the slope coefficient in this regression are:
- A 48.
- B 49.
- C 50.
A - 48
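The degrees of freedom are the number of observations minus the number of estimated parameters. The sketch below does the subtraction and cross-checks it with `lm()` on placeholder data (simulated, not Liu's sample).

```r
n_obs    <- 50  # observations in Liu's sample
n_params <- 2   # estimated parameters: intercept and slope
df_slope <- n_obs - n_params
df_slope

# Cross-check on simulated placeholder data: lm() reports the same residual df
set.seed(7)  # arbitrary seed
x <- rnorm(n_obs)
y <- rnorm(n_obs)
df.residual(lm(y ~ x))
```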
Problem 21
The upper bound for the 95% confidence interval for the coefficient on the debt ratio in the regression is closest to:
- A −1.0199.
- B −0.3947.
- C 1.4528.
# Critical value at 95% level
# Two tail
# alpha
a <- 0.05
# critical level
critical <- qt(a / 2, df = 48, lower.tail = FALSE)
upper <- -4.1589 + (critical * 1.8717)
upper
## [1] -0.3955949
B - −0.3947
Problem 22
Which of the following should Liu conclude from these results shown in Exhibit 2?
- A The average short interest ratio is 5.4975.
- B The estimated slope coefficient is statistically significant at the 0.05 level.
- C The debt ratio explains 30.54% of the variation in the short interest ratio.
B - The estimated slope coefficient is statistically significant at the 0.05 level.
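The significance claim can be verified with the numbers already used in Problems 17 and 21. This is a sketch assuming a two-tailed test at the 5 percent level.

```r
slope    <- -4.1589  # slope estimate (from Problem 21)
se_slope <- 1.8717   # its standard error (from Problem 21)
df_resid <- 48       # n - 2 with n = 50
r2       <- 0.0933   # R-squared (from Problem 17)

t_stat   <- slope / se_slope
critical <- qt(0.975, df = df_resid)             # two-tailed, 5 percent level
p_value  <- 2 * pt(-abs(t_stat), df = df_resid)

cat("t =", round(t_stat, 4), "critical =", round(critical, 4),
    "p =", round(p_value, 4), "\n")
cat("Significant at 5 percent:", abs(t_stat) > critical, "\n")
cat("Share of variation explained:", r2 * 100, "percent\n")
```

The absolute t-statistic exceeds the critical value, which supports B. Option A mistakes the intercept (5.4975) for the average, and option C quotes the absolute correlation (30.54%) where the R-squared of 9.33% belongs.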
Problem 23
Based on Exhibit 2, the short interest ratio expected for MQD Corporation is closest to:
- A 3.8339.
- B 5.4975.
- C 6.2462.
5.4975 + (-4.1589 * 0.4)
## [1] 3.83394
A - 3.8339
Problem 24
Based on Liu’s regression results in Exhibit 2, the F-statistic for testing whether the slope coefficient is equal to zero is closest to:
- A −2.2219.
- B 3.5036.
- C 4.9367.
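# F-statistic = mean square regression / mean square residual
# (named f_stat because F is R's shorthand for FALSE)
f_stat <- 38.4404 / 7.7867
f_stat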
## [1] 4.936674
C - 4.9367
Problem 25
Which of Vasileva’s assumptions regarding regression analysis is incorrect?
- A Assumption 1
- B Assumption 2
- C Assumption 3
C - Assumption 3
Problem 26
Based on Exhibit 1, the standard error of the estimate is closest to:
- A 0.044558.
- B 0.045850.
- C 0.050176.
# From the table:
# sum of squared residuals
ssr <- 0.071475
# number of observations
n <- 36
# standard error of the estimate
see <- sqrt(ssr / (n - 2))
see
## [1] 0.04584982
B - 0.045850
Problem 27
Based on Exhibit 2, Vasileva should reject the null hypothesis that:
- A the slope is less than or equal to 0.15.
- B the intercept is less than or equal to 0.
- C crude oil returns do not explain Amtex share returns.
C - crude oil returns do not explain Amtex share returns.
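Two of the three hypotheses can be tested with the slope and standard error used in Problem 28. The sketch assumes a 5 percent significance level; the intercept test in option B is left out because its standard error does not appear in this post.

```r
slope    <- 0.2354  # slope estimate (from Problem 28)
se_slope <- 0.076   # its standard error (from Problem 28)
df_resid <- 36 - 2

# H0: slope = 0 (two-tailed), i.e. crude oil returns do not explain Amtex returns
t_zero   <- slope / se_slope
crit_two <- qt(0.975, df = df_resid)

# H0: slope <= 0.15 (one-tailed)
t_015    <- (slope - 0.15) / se_slope
crit_one <- qt(0.95, df = df_resid)

cat("H0 slope = 0:     t =", round(t_zero, 4), "reject:", t_zero > crit_two, "\n")
cat("H0 slope <= 0.15: t =", round(t_015, 4), "reject:", t_015 > crit_one, "\n")
```

Only the first null hypothesis is rejected, which is consistent with answer C.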
Problem 28
Based on Exhibit 2, Vasileva should compute the:
- A coefficient of determination to be 0.4689.
- B 95% confidence interval for the intercept to be –0.0037 to 0.0227.
- C 95% confidence interval for the slope coefficient to be 0.0810 to 0.3898.
# alpha 0.05
a <- 0.05
df <- 36 - 2
critical <- qt(a / 2, df = df, lower.tail = FALSE)
upper <- 0.2354 + (critical * 0.076)
lower <- 0.2354 - (critical * 0.076)
cat("Range is", lower, "to", upper)
## Range is 0.08094942 to 0.3898506
C - 95% confidence interval for the slope coefficient to be 0.0810 to 0.3898.
Problem 29
Based on Exhibit 2 and Vasileva’s prediction of the crude oil return for month 37, the estimate of Amtex share return for month 37 is closest to:
- A –0.0024.
- B 0.0071.
- C 0.0119.
0.0095 + (0.2354 * -0.01)
## [1] 0.007146
B - 0.0071.
Problem 30
Using information from Exhibit 2, Vasileva should compute the 95% prediction interval for Amtex share return for month 37 to be:
- A –0.0882 to 0.1025.
- B –0.0835 to 0.1072.
- C 0.0027 to 0.0116.
# variance (of the prediction error)
v <- 0.0022
# sd
s <- sqrt(v)
# predicted value
p <- 0.0095 + (0.2354 * -0.01)
# Two-tailed critical t-value at the 5% level, 36 - 2 degrees of freedom
critical <- qt(0.05 / 2, df = 36 - 2, lower.tail = FALSE)
# Lower and upper limits for the prediction
lower <- p - (critical * s)
upper <- p + (critical * s)
cat("Prediction range is", lower, "to", upper)
## Prediction range is -0.08817472 to 0.1024667
A - –0.0882 to 0.1025.
Problem 31
Based on Exhibit 1, Olabudo should:
- A conclude that the inflation predictions are unbiased.
- B reject the null hypothesis that the slope coefficient equals 1.
- C reject the null hypothesis that the intercept coefficient equals 0.
A - conclude that the inflation predictions are unbiased.
Problem 33
Which of Olabudo’s noted limitations of regression analysis is correct?
- A Only Limitation 1
- B Only Limitation 2
- C Both Limitation 1 and Limitation 2
C - Both Limitation 1 and Limitation 2