Throughout, we use the following notation:
• $1_n$ is the all-ones vector of size $n$,
• $0_n$ is the all-zeros vector of size $n$,
• $I_n$ is the $n \times n$ identity matrix.
Problem 1 (30 points) 6 + 10 + 14 = 30
Assume that $z_1$, $z_2$ and $z_3$ are independent draws from $N(1, 1)$, that is, a normal distribution
with mean and variance both equal to 1. Consider the vector
$$z = \begin{pmatrix} z_1 - z_3 \\ z_2 \end{pmatrix}.$$
(a) Fully specify the distribution of $z$.
(b) Can we choose $a \in \mathbb{R}^3$ such that $a^T z$ is independent of $z_1 - z_2$? If so, specify one such
$a$; otherwise, argue why not.
Let $x = [x_1, x_2]^T$ be obtained by projecting $z$ onto the span of the following set of vectors:
$$\begin{pmatrix} 1 \\ -1 \end{pmatrix}, \quad \begin{pmatrix} -1 \\ 1 \end{pmatrix}.$$
(c) Find the (joint) distribution of the random vector
$$\begin{pmatrix} z \\ x \end{pmatrix}.$$
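The setup of Problem 1 can be explored numerically. The R sketch below is only an illustration, not a solution: the seed and the number of simulated draws `n_sim` are arbitrary choices, and the names `Z`, `P`, `X_proj` and `stacked` are introduced here for convenience. It uses the fact that the two spanning vectors are negatives of each other, so their span is the line through $(1, -1)^T$.

```r
# Simulation sketch for Problem 1 (seed and n_sim are arbitrary assumptions).
set.seed(1)
n_sim <- 1e5

# Independent draws z1, z2, z3 ~ N(1, 1): mean 1 and variance 1 (so sd = 1).
z1 <- rnorm(n_sim, mean = 1, sd = 1)
z2 <- rnorm(n_sim, mean = 1, sd = 1)
z3 <- rnorm(n_sim, mean = 1, sd = 1)

# Each row of Z is one realization of z = (z1 - z3, z2).
Z <- cbind(z1 - z3, z2)

# The spanning vectors (1, -1) and (-1, 1) lie on the same line, so the
# orthogonal projector onto their span is v v^T / (v^T v) with v = (1, -1).
v <- c(1, -1)
P <- tcrossprod(v) / sum(v^2)

# Each row of X_proj is the projection x = P z of the matching row of Z.
X_proj <- Z %*% t(P)

# Empirical mean and covariance of the stacked vector (z, x).
stacked <- cbind(Z, X_proj)
colMeans(stacked)
cov(stacked)
```

The empirical mean and covariance printed at the end can be compared against a hand-derived answer to parts (a) and (c).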
Problem 2 (40 points) 8 + 8 + 8 + 16 = 40
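Assume that the data $y$ is generated according to the following model,
$$y = \beta_1 x_1 + \beta_2 x_2 + \varepsilon,$$
where $y, \varepsilon, x_j \in \mathbb{R}^n$ and $\beta_j \in \mathbb{R}$ for $j = 1, 2$. Assume that $\varepsilon \sim N(0, 2I_n)$ and
$$\|x_1\|^2 = \|x_2\|^2 = n, \quad \langle x_1, x_2 \rangle = -\tfrac{1}{2} n,$$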
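and $n = 10$. Suppose that we mistakenly pose and fit the regression model $y \sim \gamma x_1$. That
is, we compute the least-squares estimate
$$\hat{\gamma} = \operatorname*{argmin}_{\gamma \in \mathbb{R}} \|y - \gamma x_1\|^2$$
and form the fitted-value vector $\hat{\mu} = \hat{\gamma} x_1$ and the residual vector $e = y - \hat{\gamma} x_1$.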
Answer the following questions. Justify all your answers.
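(a) For each of the following vectors, determine whether $e$ is guaranteed to be orthogonal
to it or not:
$$x_1, \quad x_2, \quad \hat{\mu}, \quad x_1 + \hat{\mu}.$$
(b) Is $\hat{\gamma}$ unbiased for estimating $\beta_1$?
(c) Find the variance of $\hat{\gamma}$.
(d) Find the joint distribution of
$$\begin{pmatrix} \hat{\gamma} \\ e \end{pmatrix}.$$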
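The setting of Problem 2 can be reproduced in R as a rough sketch. The problem does not give $x_1$, $x_2$, $\beta_1$ or $\beta_2$, so everything concrete below is an assumption made for illustration: the vectors are built from two random orthonormal directions so that the stated norms and inner product hold, and `beta1 = 1`, `beta2 = 2` are hypothetical values.

```r
# Simulation sketch for Problem 2 (seed, x1, x2 and beta values are assumptions).
set.seed(1)
n <- 10

# Two orthonormal vectors u1, u2 from the QR decomposition of a random matrix.
Q <- qr.Q(qr(matrix(rnorm(n * 2), n, 2)))
u1 <- Q[, 1]
u2 <- Q[, 2]

# Scale and mix them so that ||x1||^2 = ||x2||^2 = n and <x1, x2> = -n/2.
x1 <- sqrt(n) * u1
x2 <- sqrt(n) * (-0.5 * u1 + sqrt(3) / 2 * u2)

# Hypothetical true coefficients (not given in the problem).
beta1 <- 1
beta2 <- 2

# Noise with covariance 2 * I_n, i.e. standard deviation sqrt(2) per entry.
eps <- rnorm(n, mean = 0, sd = sqrt(2))
y <- beta1 * x1 + beta2 * x2 + eps

# Misspecified fit y ~ gamma * x1, without an intercept.
fit <- lm(y ~ 0 + x1)
gamma_hat <- unname(coef(fit))
mu_hat <- gamma_hat * x1   # fitted-value vector
e <- y - mu_hat            # residual vector

# Check the design constraints and show the estimate.
c(norm_x1_sq = sum(x1^2), norm_x2_sq = sum(x2^2), inner = sum(x1 * x2))
gamma_hat
```

Wrapping the noise draw and the fit in a loop over many replications gives Monte Carlo estimates of the mean and variance of `gamma_hat`, which can serve as a sanity check on answers to parts (b) and (c).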
Problem 3 (30 points) 6 + 12 + 12 = 30
Consider the linear regression model $y = X\beta + \varepsilon$ with transposed design matrix $X^T$ given by
t(X)
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## [1,] 0 1 0 -1 -1 0 0 -1 0 0
## [2,] -1 0 1 0 0 -1 0 0 0 -1
## [3,] -1 1 0 -1 0 1 0 0 0 0
Note that the model does not include an intercept.
We fit a regression model with input y and design matrix X and obtain the following
estimate of the coefficient vector β:
coef(lm(y ~ 0 + X))
## X1 X2 X3
## -0.305 0.880 0.840
We also observe that $\|y\|^2 = 5.462436$. Answer the following questions:
(a) Find $s^2$, the unbiased estimate of the noise variance.
(b) Test the null hypothesis $4\beta_1 = \beta_2 + \beta_3$ at level $\alpha = 0.05$.
(c) What would the least-squares estimate of $\beta$ be if we fit a regression model with response
vector $y + x_3$ and design matrix $X$?
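For working on Problem 3 in R, the design matrix can be re-entered from the printed output of `t(X)`. The sketch below only sets up the given quantities. The response `y` itself is not provided, so the fit cannot be rerun, and the names `Xt`, `XtX`, `beta_hat` and `y_norm_sq` are introduced here for convenience.

```r
# Re-enter the transposed design matrix exactly as printed by t(X).
Xt <- rbind(
  c( 0, 1, 0, -1, -1,  0, 0, -1, 0,  0),
  c(-1, 0, 1,  0,  0, -1, 0,  0, 0, -1),
  c(-1, 1, 0, -1,  0,  1, 0,  0, 0,  0)
)
X <- t(Xt)      # 10 x 3 design matrix, no intercept column

n <- nrow(X)    # number of observations
p <- ncol(X)    # number of coefficients

# Reported quantities from the problem statement.
beta_hat <- c(X1 = -0.305, X2 = 0.880, X3 = 0.840)
y_norm_sq <- 5.462436

# Gram matrix X^T X, which appears in the usual least-squares formulas.
XtX <- crossprod(X)
XtX

# Residual degrees of freedom for a model with p coefficients.
n - p
```

Since `crossprod(X)` depends only on the design, it can be combined with `beta_hat` and `y_norm_sq` when working through parts (a) and (b) by hand.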