{ "cells": [ { "cell_type": "markdown", "id": "5eb8c31b", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Appendix: Matrix calculus\n", "\n", "[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/git/https%3A%2F%2Fgitlab.in2p3.fr%2Fenergy4climate%2Fpublic%2Feducation%2Fmachine_learning_for_climate_and_energy/master?filepath=book%2Fnotebooks%2F1_introduction.ipynb)" ] }, { "cell_type": "markdown", "id": "8632bbd3", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Linear algebra and calculus" ] }, { "cell_type": "markdown", "id": "0b5d2c9b", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Throughout this class we adopt the following conventions:\n", "\n", "- Vectors are denoted by bold lowercase letters and are represented as columns\n", "\n", "\\begin{align}\n", " \\mathbf{x} &= \\begin{bmatrix}\n", " x_1 \\\\\n", " x_2 \\\\\n", " \\vdots \\\\\n", " x_n \\\\\n", " \\end{bmatrix}\n", "\\end{align}" ] }, { "cell_type": "markdown", "id": "6227bfea", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "- Matrices are denoted by bold capital letters\n", "\n", "\\begin{align}\n", " \\mathbf{A} &= \\begin{bmatrix}\n", " a_{11} & a_{12} & \\cdots & a_{1p}\\\\\n", " a_{21} & a_{22} & \\cdots & a_{2p}\\\\\n", " \\vdots & \\vdots & & \\vdots\\\\\n", " a_{n1} & a_{n2} & \\cdots & a_{np}\\\\\n", " \\end{bmatrix}\n", "\\end{align}" ] }, { "cell_type": "markdown", "id": "91917904", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "The scalar product of two vectors $\\mathbf{x}$ and $\\mathbf{y}$ is given by\n", "\\begin{equation}\n", "\\mathbf{x}\\cdot \\mathbf{y} = \\mathbf{x}^\\top \\mathbf{y} = \\sum_i x_i y_i\n", "\\end{equation}\n", "\n", "The matrix-vector product is defined componentwise by\n", "\\begin{align}\n", "\\begin{bmatrix}\n", "\\mathbf{A}\\mathbf{x}\n", "\\end{bmatrix}_i = \\sum_{j} A_{ij} x_j\n", "\\end{align}\n", "\n", "We then adopt the \"Jacobian\" 
convention, or [numerator layout](https://en.wikipedia.org/wiki/Matrix_calculus#Layout_conventions), for which the gradient of a scalar function $f(\\mathbf{x})$ is a row vector.\n", "\n", "\\begin{align}\n", " \\frac{\\partial f}{\\partial \\mathbf{x}} = \\nabla_\\mathbf{x} f &= \\begin{bmatrix}\n", " \\frac{\\partial f}{\\partial x_1} &\n", " \\frac{\\partial f}{\\partial x_2} &\n", " \\cdots &\n", " \\frac{\\partial f}{\\partial x_n}\n", " \\end{bmatrix}.\n", "\\end{align}" ] }, { "cell_type": "markdown", "id": "1dfc749d", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "With this convention, the partial derivative of a vector with respect to a scalar is a column vector:\n", "\\begin{align}\n", " \\frac{\\partial \\mathbf{x}}{\\partial y} &= \\begin{bmatrix}\n", " \\frac{\\partial x_1}{\\partial y} \\\\\n", " \\frac{\\partial x_2}{\\partial y} \\\\\n", " \\vdots \\\\\n", " \\frac{\\partial x_n}{\\partial y}\n", " \\end{bmatrix},\n", "\\end{align}" ] }, { "cell_type": "markdown", "id": "5bae8e8f", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "and the derivative of a vector with respect to a vector is the Jacobian matrix\n", "\n", "\\begin{align}\n", " \\frac{\\partial \\mathbf{y}}{\\partial \\mathbf{x}} &= \\begin{bmatrix}\n", " \\frac{\\partial y_1}{\\partial x_1} & \\frac{\\partial y_1}{\\partial x_2} & \\cdots & \\frac{\\partial y_1}{\\partial x_n} \\\\\n", " \\frac{\\partial y_2}{\\partial x_1} & \\frac{\\partial y_2}{\\partial x_2} & \\cdots & \\frac{\\partial y_2}{\\partial x_n} \\\\\n", " \\vdots & \\vdots & \\ddots & \\vdots \\\\\n", " \\frac{\\partial y_p}{\\partial x_1} & \\frac{\\partial y_p}{\\partial x_2} & \\cdots & \\frac{\\partial y_p}{\\partial x_n}\\\\\n", " \\end{bmatrix}.\n", "\\end{align}\n", "\n", "Its $(i,j)$th element is\n", "\\begin{align}\n", "\\begin{bmatrix}\n", "\\frac{\\partial \\mathbf{y}}{\\partial \\mathbf{x}}\n", "\\end{bmatrix}_{ij} = \\frac{\\partial y_i}{\\partial x_j}\n", "\\end{align}" ] }, { 
"cell_type": "markdown", "id": "6b72df4e", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "With these conventions, we recall the following rules (with $\\mathbf{A}$ and $\\mathbf{a}$ not a function of $\\mathbf{x}$, and $\\mathbf{u}$, $\\mathbf{v}$ funtions of $\\mathbf{x}$):\n", "\n", "\n", "\\begin{equation}\n", "\\frac{\\partial \\mathbf{A}\\mathbf{x}}{\\partial \\mathbf{x}} = \\mathbf{A}\n", "\\end{equation}\n", "\n", "\n", "\n", "Product:\n", "\n", "\\begin{equation}\n", "\\frac{\\partial \\mathbf{u}^\\top \\mathbf{v}}{\\partial \\mathbf{x}} = \\mathbf{u}^\\top \\frac{\\partial \\mathbf{v}}{\\partial \\mathbf{x}} + \\mathbf{v}^\\top\\frac{\\partial \\mathbf{u}}{\\partial \\mathbf{x}}\n", "\\end{equation}\n", "\n", "The chain rule \n", "\n", "\\begin{equation}\n", "\\frac{\\partial \\mathbf{f(g(x))}}{\\partial \\mathbf{x}} = \\frac{\\partial \\mathbf{f(g)}}{\\partial \\mathbf{g}}\\frac{\\partial \\mathbf{g(x)}}{\\partial \\mathbf{x}} \n", "\\end{equation}\n", "\n", "*Note that the order in which the operators appear matters for matrix multiplication*." 
] }, { "cell_type": "markdown", "id": "a4d70e94", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "> ***Question***: Can you derive the following two formulas?\n", "> - \\begin{equation}\n", "\\frac{\\partial \\mathbf{a}u}{\\partial \\mathbf{x}} = ?\n", "\\end{equation}\n", "> - \\begin{equation}\n", "\\frac{\\partial \\mathbf{A}\\mathbf{u}}{\\partial \\mathbf{x}} = ?\n", "\\end{equation}\n", "\n", "You can check your answers [here](https://en.wikipedia.org/wiki/Matrix_calculus) and [here (Chap 5)](https://mml-book.github.io/book/mml-book.pdf)" ] }, { "cell_type": "markdown", "id": "5e186998", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "***\n", "## Credit\n", "\n", "[//]: # \"This notebook is part of [E4C Interdisciplinary Center - Education](https://gitlab.in2p3.fr/energy4climate/public/education).\"\n", "Contributors include Bruno Deremble and Alexis Tantet.\n", "\n", "
\n", "\n", "
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.\n", "
" ] } ], "metadata": { "celltoolbar": "Slideshow", "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.6" }, "latex_envs": { "LaTeX_envs_menu_present": true, "autoclose": true, "autocomplete": false, "bibliofile": "biblio.bib", "cite_by": "apalike", "current_citInitial": 1, "eqLabelWithNumbers": true, "eqNumInitial": 1, "hotkeys": { "equation": "Ctrl-E", "itemize": "Ctrl-I" }, "labels_anchors": false, "latex_user_defs": false, "report_style_numbering": false, "user_envs_cfg": false }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": false }, "varInspector": { "cols": { "lenName": 16, "lenType": 16, "lenVar": 40 }, "kernels_config": { "python": { "delete_cmd_postfix": "", "delete_cmd_prefix": "del ", "library": "var_list.py", "varRefreshCmd": "print(var_dic_list())" }, "r": { "delete_cmd_postfix": ") ", "delete_cmd_prefix": "rm(", "library": "var_list.r", "varRefreshCmd": "cat(var_dic_list()) " } }, "types_to_exclude": [ "module", "function", "builtin_function_or_method", "instance", "_Feature" ], "window_display": false } }, "nbformat": 4, "nbformat_minor": 5 }