{ "cells": [ { "cell_type": "markdown", "id": "796c18c6", "metadata": {}, "source": [ "# Appendix: Supplementary Material" ] }, { "cell_type": "markdown", "id": "15f64fa4", "metadata": {}, "source": [ "## On Ordinary Least Squares (OLS)" ] }, { "cell_type": "markdown", "id": "42b13afd", "metadata": {}, "source": [ "### How to Minimize the Residual Sum of Squares (RSS)?\n", "\n", "The predictions obtained with the OLS parameter estimates $\\hat{\\boldsymbol{\\beta}}$ are given by\n", "\n", "\\begin{equation}\n", "\\hat{\\mathbf{y}} = \\mathbf{X} \\hat{\\boldsymbol{\\beta}} = \\mathbf{X} \\left(\\mathbf{X}^\\top \\mathbf{X}\\right)^{-1} \\left(\\mathbf{X}^\\top \\mathbf{y}\\right).\n", "\\end{equation}\n", "\n", "The residual vector is given by $\\hat{\\mathbf{z}} = \\mathbf{y} - \\hat{\\mathbf{y}}$.\n", "\n", "> ***Question (optional)***\n", "> - Show that $\\hat{\\mathbf{y}}$ is the orthogonal projection of $\\mathbf{y}$ onto the subspace of $\\mathbb{R}^N$ spanned by the columns of $\\mathbf{X}$ (i.e. the column space of $\\mathbf{X}$) and that $\\hat{\\mathbf{z}}$ is orthogonal to this space." ] }, { "cell_type": "markdown", "id": "3cf8e2bd", "metadata": {}, "source": [ "### Graphical Interpretation and Gram-Schmidt Algorithm\n", "\n", "By *regressing* $\\mathbf{b}$ on $\\mathbf{a}$ we mean a least-squares fit with input $\\mathbf{a}$ and target $\\mathbf{b}$.\n", "\n", "> ***Question***\n", "> - Regress $\\mathbf{x}$ on $\\mathbf{1}$ and compute the resulting residual $\\hat{\\mathbf{z}}_1$.\n", "> - Regress $\\mathbf{y}$ on $\\hat{\\mathbf{z}}_1$. The result should be familiar.\n", "> - Interpret the above procedure graphically.\n", "> - Generalize this procedure to the case of $p$ inputs and express the $j$th estimate in terms of some $\\hat{\\mathbf{z}}_j$ as $\\hat{\\beta}_j = \\hat{\\mathbf{z}}_j^\\top \\mathbf{y} / (\\hat{\\mathbf{z}}_j^\\top \\hat{\\mathbf{z}}_j)$ (optional)."
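, "\n",
"\n",
"A possible sanity check for the first two steps (a sketch, not the exercise's intended derivation; the simulated data and all variable names are our own): regress $\\mathbf{x}$ on $\\mathbf{1}$, regress $\\mathbf{y}$ on the residual $\\hat{\\mathbf{z}}_1$, and compare the result with the slope obtained from the normal equations.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"rng = np.random.default_rng(0)\n",
"x = rng.normal(size=50)\n",
"y = 2.0 + 3.0 * x + rng.normal(scale=0.5, size=50)\n",
"\n",
"def regress(a, b):\n",
"    # Coefficient of the (no-intercept) least-squares regression of b on a.\n",
"    return a @ b / (a @ a)\n",
"\n",
"ones = np.ones_like(x)\n",
"z1 = x - regress(ones, x) * ones  # residual of x regressed on 1\n",
"beta_1 = regress(z1, y)           # slope of y regressed on z1\n",
"\n",
"# Compare with the OLS estimates from the normal equations\n",
"X = np.column_stack([ones, x])\n",
"beta_ols = np.linalg.solve(X.T @ X, X.T @ y)\n",
"assert np.isclose(beta_1, beta_ols[1])\n",
"```"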
] }, { "cell_type": "markdown", "id": "f6c1f2b5", "metadata": {}, "source": [ "### Gauss-Markov Theorem\n", "\n", "We now assume that $Y = \\boldsymbol{X}^\\top \\boldsymbol{\\beta} + \\epsilon$, where the errors $\\epsilon$ are *uncorrelated*, with *mean zero* and *constant variance* $\\sigma^2$.\n", "\n", "> ***Question (optional)***\n", "> - Express the variances of the parameter estimates in terms of the orthogonal basis of the column space of $\\mathbf{X}$ constructed above.\n", "> - How does the precision of $\\hat{\\beta}_j$ depend on the input data?" ] }, { "cell_type": "markdown", "id": "335ab163", "metadata": {}, "source": [ "