{ "cells": [ { "cell_type": "markdown", "id": "796c18c6", "metadata": {}, "source": [ "# Appendix: Supplementary Material" ] }, { "cell_type": "markdown", "id": "15f64fa4", "metadata": {}, "source": [ "## On Ordinary Least Squares (OLS)" ] }, { "cell_type": "markdown", "id": "42b13afd", "metadata": {}, "source": [ "### How to Minimize the Residual Sum of Squares (RSS)?\n", "\n", "The predictions obtained with the OLS parameter estimates $\\hat{\\boldsymbol{\\beta}}$ are given by\n", "\n", "\\begin{equation}\n", "\\hat{\\mathbf{y}} = \\mathbf{X} \\hat{\\boldsymbol{\\beta}} = \\mathbf{X} \\left(\\mathbf{X}^\\top \\mathbf{X}\\right)^{-1} \\left(\\mathbf{X}^\\top \\mathbf{y}\\right).\n", "\\end{equation}\n", "\n", "The residual vector is given by $\\hat{\\mathbf{z}} = \\mathbf{y} - \\hat{\\mathbf{y}}$.\n", "\n", "> ***Question (optional)***\n", "> - Show that $\\hat{\\mathbf{y}}$ is the orthogonal projection of $\\mathbf{y}$ onto the subspace of $\\mathbb{R}^N$ spanned by the columns of $\\mathbf{X}$ (i.e. the column space of $\\mathbf{X}$) and that $\\hat{\\mathbf{z}}$ is orthogonal to this space." ] }, { "cell_type": "markdown", "id": "3cf8e2bd", "metadata": {}, "source": [ "### Graphical Interpretation and Gram-Schmidt Algorithm\n", "\n", "By *regressing* $\\mathbf{b}$ on $\\mathbf{a}$ we mean a least-squares fit with input $\\mathbf{a}$ and target $\\mathbf{b}$.\n", "\n", "> ***Question***\n", "> - Regress $\\mathbf{x}$ on $\\mathbf{1}$ and compute the resulting residual $\\hat{\\mathbf{z}}_1$.\n", "> - Regress $\\mathbf{y}$ on $\\hat{\\mathbf{z}}_1$. The result should be familiar.\n", "> - Interpret the above procedure graphically.\n", "> - Generalize this procedure to the case of $p$ inputs and express the $j$th estimate in terms of some $\\hat{\\mathbf{z}}_j$ as $\\hat{\\beta}_j = \\hat{\\mathbf{z}}_j^\\top \\mathbf{y} / (\\hat{\\mathbf{z}}_j^\\top \\hat{\\mathbf{z}}_j)$ (optional)."
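, "\n",
"\n",
"A possible sanity check for the first two steps (a sketch, not the exercise's intended derivation; the simulated data and all variable names are our own): regress $\\mathbf{x}$ on $\\mathbf{1}$, regress $\\mathbf{y}$ on the residual $\\hat{\\mathbf{z}}_1$, and compare the result with the slope obtained from the normal equations.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"rng = np.random.default_rng(0)\n",
"x = rng.normal(size=50)\n",
"y = 2.0 + 3.0 * x + rng.normal(scale=0.5, size=50)\n",
"\n",
"def regress(a, b):\n",
"    # Coefficient of the (no-intercept) least-squares regression of b on a.\n",
"    return a @ b / (a @ a)\n",
"\n",
"ones = np.ones_like(x)\n",
"z1 = x - regress(ones, x) * ones  # residual of x regressed on 1\n",
"beta_1 = regress(z1, y)           # slope of y regressed on z1\n",
"\n",
"# Compare with the OLS estimates from the normal equations\n",
"X = np.column_stack([ones, x])\n",
"beta_ols = np.linalg.solve(X.T @ X, X.T @ y)\n",
"assert np.isclose(beta_1, beta_ols[1])\n",
"```"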
] }, { "cell_type": "markdown", "id": "f6c1f2b5", "metadata": {}, "source": [ "### Gauss-Markov Theorem\n", "\n", "We now assume that $Y = \\boldsymbol{X}^\\top \\boldsymbol{\\beta} + \\epsilon$, where the errors $\\epsilon$ are *uncorrelated*, with *mean zero* and *constant variance* $\\sigma^2$.\n", "\n", "> ***Question (optional)***\n", "> - Express the variances of the parameter estimates in terms of the orthogonal basis of the column space of $\\mathbf{X}$ constructed above.\n", "> - How does the precision of $\\hat{\\beta}_j$ depend on the input data?" ] }, { "cell_type": "markdown", "id": "335ab163", "metadata": {}, "source": [ "