{ "cells": [ { "cell_type": "markdown", "id": "92234cfe", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# Regularization, Model Selection and Evaluation\n", "\n", "[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/git/https%3A%2F%2Fgitlab.in2p3.fr%2Fenergy4climate%2Fpublic%2Feducation%2Fmachine_learning_for_climate_and_energy/master?filepath=book%2Fnotebooks%2F04_regularization_selection_validation.ipynb)" ] }, { "cell_type": "markdown", "id": "42edfa17", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "
\n", " Prerequisites\n", " \n", "- [Elements of Probability Theory](appendix_elements_of_probability_theory.ipynb) \n", "- [Understand the overfitting-underfitting and bias-variance tradeoffs](3_overfitting_underfitting_bias_variance)\n", "
" ] }, { "cell_type": "markdown", "id": "75768246", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "
\n", " Learning Outcomes\n", " \n", "- Estimate the expected prediction error using cross-validation\n", "- Use regularization to prevent overfitting\n", "- Be aware of underlying statistical assumptions (identity, independence)\n", "
" ] }, { "cell_type": "markdown", "id": "56a78dba", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Cross-Validation\n", "\n", "- Data is often scarce so that living aside test data to estimat the Prediction Error (PE) is a problem\n", "- *Cross-validation* is used to estimate the Expected PE (EPE), while avoiding living aside test data\n", "- It uses part of the data to train, part of the data to test, repeating the operation on different subset selections" ] }, { "cell_type": "markdown", "id": "146b2247", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### $K$-Fold Cross Validation\n", "\n", "1. Slice the train data into $K$ roughly equal-sized parts;\n", "2. Keep the $K$th part for validation and train on the $K - 1$ other parts;\n", "3. Compute the prediction error $K$ using the $K$th part;\n", "4. Repeat the operation for all $K$ parts;\n", "5. Average the prediction errors to estimate the EPE.\n", "\n", "" ] }, { "cell_type": "markdown", "id": "867149e3", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "\n", "
\n", " Warning\n", " \n", "Cross-validation allows one to estimate **expected** prediction error rather than the prediction error conditioned on a particular training set (see Chap. 7.12 in Hastie *et al.* 2009).\n", "
" ] }, { "cell_type": "markdown", "id": "d19b6666", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Choosing the number $K$ of parts\n", "\n", "- **$K$ small** (e.g. $K = 2$):\n", "
biases the estimate towards large errors (CV training sets of size $(K - 1) / K \times N$ are much smaller than the original training set of size $N$);\n", "- **$K$ large** (e.g. $K = N$, leave-one-out CV):\n", "
approximately unbiased estimate of the EPE, but with high variance (the $N$ \"training sets\" are very similar to one another);\n", "- **$K = 5$ or 10**:\n", "
recommended by Hastie *et al.* (2009) as a good compromise, but this depends on the case study." ] }, { "cell_type": "markdown", "id": "5224f338", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Adapting Cross-Validation to the Data Structure\n", "\n", "For instance:\n", "- By choosing the number of parts $K$ to adapt to cycles (e.g. avoid splitting years)\n", "- By grouping (e.g. if measurements for different people should be kept together)\n", "- By shuffling splits (e.g. if the ordering of the data is special)\n", "- By taking serial correlations into account in time series.\n", "\n", "See https://scikit-learn.org/stable/modules/cross_validation.html#cross-validation-iterators" ] }, { "cell_type": "markdown", "id": "f3d01d2f", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Regularizing to Avoid Overfitting: Linear Models" ] }, { "cell_type": "markdown", "id": "a87795b4", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Linear Models Can Overfit\n", "\n", "- Linear models are simpler than alternatives\n", "- $\\rightarrow$ they tend to overfit less than alternatives\n", "- They often even underfit when:\n", " - $p$ is small\n", " - the problem is not linearly separable\n", " " ] }, { "cell_type": "markdown", "id": "52ef481a", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "- But linear models can also overfit when:\n", " - $N$ is small\n", " - there are many uninformative features (features that depend strongly on other features)" ] }, { "cell_type": "markdown", "id": "849a2a8e", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Effect of uninformative features\n", "\n", "Recall that, if we assume that the generating model is $Y = \\boldsymbol{X}^\\top \\boldsymbol{\\beta} + \\epsilon$, where the observations of $\\epsilon$ are *uncorrelated*, with *mean zero* and *constant variance* $\\sigma^2$, then\n", "\n", "$$\n", "\\begin{aligned}\n", "\\mathbb{E}(\\hat{\\boldsymbol{\\beta}} | \\mathbf{X}) &= \\boldsymbol{\\beta}\\\\\n", "\\mathrm{Var}(\\hat{\\boldsymbol{\\beta}} | \\mathbf{X}) &= \\sigma^2 (\\mathbf{X}^\\top \\mathbf{X})^{-1}.\n", "\\end{aligned}\n", "$$" ] }, { "cell_type": "markdown", "id": "210a3dd5", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**2D example:** Assume that\n", "$$\n", "X_1 = c X_0 + \\sqrt{1 - c^2} Z\\\\\n", "\\mathrm{~with~}\n", "\\mathrm{Var}(X_0) = \\mathrm{Var}(Z) = 1\n", "\\mathrm{~and~}\n", "\\mathrm{Cov}(Z, X_0) = 0,\\\\\n", "\\mathrm{Cov}(X_0, Y) = 1\n", "\\mathrm{~and~}\n", "\\mathrm{Cov}(Z, Y) = 0.\n", "$$" ] }, { "cell_type": "markdown", "id": "12660c41", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "$$\n", "\\mathrm{Then,~}\n", "\\frac{\\mathbf{X}^\\top \\mathbf{X}}{N - 1} \\to\n", "\\begin{pmatrix}\n", " 1 & c\\\\\n", " c & 1\n", "\\end{pmatrix}\n", "\\mathrm{~and~}\n", "\\frac{\\mathbf{X}^\\top \\mathbf{y}}{N - 1} \\to\n", "\\begin{pmatrix}\n", " 1\\\\\n", " c\n", "\\end{pmatrix} \\mathrm{~when~} N \\to \\infty.\n", "$$" ] }, { "cell_type": "markdown", "id": "9d887a4c", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "$$\n", "\\mathrm{Thus,~}\n", "\\left(\\frac{\\mathbf{X}^\\top \\mathbf{X}}{N - 1}\\right)^{-1} \\to \\frac{1}{1 - c^2}\n", "\\begin{pmatrix}\n", " 1 & -c\\\\\n", " -c & 1\n", "\\end{pmatrix}\n", "\\mathrm{~and~}\n", "\\hat{\\boldsymbol{\\beta}} \\to\n", "\\begin{pmatrix}\n", " 1\\\\\n", " 0\n", "\\end{pmatrix}.\n", "$$" ] }, { "cell_type": "markdown", "id":
"85368880", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**Example interpretation:**\n", "\n", "- Absolute correlation large $\\rightarrow$ large coefficient variance;\n", "- Large positive correlation $\\rightarrow$ one coefficient large at expense of other.\n", "\n", "**To go further:**\n", "\n", "Interpret the role of covariances between inputs in OLS for $M > 2$ using an eigendecomposition of the covariance matrix. " ] }, { "cell_type": "markdown", "id": "635c72fb", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "**Problem:** Many correlated variables $\\rightarrow$ high variance.\n", "\n", "**Origin of the problem:** A widely large positive coefficient on one variable can be canceled by a similarly large negative coefficient on its correlated cousin $\\rightarrow$ large coefficients.\n", "\n", "
\n", " Idea: weight-decay penalization\n", " \n", "Limit the size of the coefficients to reduce the variance of the predictions.\n", "
" ] }, { "cell_type": "markdown", "id": "dbac8ae2", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Ridge Regression\n", "\n", "- A very common form of regularization for linear regression;\n", "- Shrink coefficients by imposing a penalty on their size;\n", "- Minimize a penalized RSS:\n", "\n", "\\begin{equation}\n", "\\hat{\\boldsymbol{\\beta}}^\\mathrm{ridge} = \\underset{\\boldsymbol{\\beta}}{\\mathrm{argmin}} \\left\\{\\sum_{i = 1}^N \\left(y_i - \\beta_0 - \\sum_{j = 1}^p x_{ij} \\beta_j \\right)^2 + \\lambda \\sum_{j = 1}^p \\beta_j^2 \\right\\}.\n", "\\end{equation}\n", "\n", "$\\lambda \\ge 0$ controls the amount of shrinkage." ] }, { "cell_type": "markdown", "id": "4747739a", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Ridge Regression as a Constrained Optimization\n", "\n", "For any $\\lambda$ there exists a $t \\ge 0$ such that the ridge regression problem is equivalent to:\n", "\n", "\\begin{equation}\n", "\\hat{\\boldsymbol{\\beta}}^\\mathrm{ridge} = \\underset{\\boldsymbol{\\beta}}{\\mathrm{argmin}} \\sum_{i = 1}^N \\left(y_i - \\beta_0 - \\sum_{j = 1}^p x_{ij} \\beta_j \\right)^2\\\\\n", "\\mathrm{subject~to~} \\sum_{j = 1}^p \\beta_j^2 \\le t.\n", "\\end{equation}" ] }, { "cell_type": "markdown", "id": "d51f0961", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "
\n", " Warning 1\n", " \n", "Contrary to OLS and like in many regularizations, Ridge solutions are not equivarient under scaling:\n", " \n", "- If different inputs represent different kinds of variables, one normally standardizes the inputs before solving.\n", "- If different inputs represent measurements of the same variable in different situations (location, epoch...), it may be interesting to keep the scales.\n", "
" ] }, { "cell_type": "markdown", "id": "f199c51a", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "
\n", " Warning 2\n", " \n", "- Penalization of the intercept would make the procedure depend on the origin\n", "chosen for $Y$.\n", "- After centering the inputs by replacing each $x_{ij}$ by $x_{ij} - \\bar{x}_j$, OLS gives $\\hat{\\beta}_0 = \\bar{y}$.\n", "- We can thus first solve the ridge for centered inputs and outputs and then add $\\hat{\\beta}_0 = \\bar{y}$ to the predictions.\n", "
" ] }, { "cell_type": "markdown", "id": "0501ec1c", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "> ***Question (optional)***\n", "> - Show that the solution to the ridge-penalized RSS is\n", "> \\begin{equation}\n", " \\hat{\\boldsymbol{\\beta}}^\\mathrm{ridge} = \\left(\\mathbf{X}^\\top \\mathbf{X} + \\lambda \\mathbf{I}\\right)^{-1} \\left(\\mathbf{X}^\\top \\mathbf{y}\\right),\n", "\\end{equation}\n", "> where $\\mathbf{I}$ is the $p\\times p$ identity matrix.\n", "\n", "> ***Question***\n", "> - How does this formula differ from the OLS solution for the coefficients?\n", "> - When are these solutions unique?" ] }, { "cell_type": "markdown", "id": "2e090895", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "> ***Question (optional)***\n", "> - Show that in the case of **orthonormal inputs**, the ridge estimates are just a scaled version of the OLS estimates: $\\hat{\\boldsymbol{\\beta}}^\\mathrm{ridge} = \\hat{\\boldsymbol{\\beta}} / (1 + \\lambda)$.\n", "\n", "> ***Question***\n", "> - Use this formula to interpret the effect of the Ridge regularization on the coefficients for orthonormal inputs." ] }, { "cell_type": "markdown", "id": "a5512e8e", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Singular Value Decomposition Interpretation (optional)\n", "\n", "Let the SVD of $\\mathbf{X}$ be $\\mathbf{X} = \\mathbf{U} \\mathbf{D} \\mathbf{V}^\\top$ with:\n", "- $\\mathbf{U}$ an $N \\times p$ orthogonal matrix with columns spanning the column space of $\\mathbf{X}$;\n", "- $\\mathbf{V}$ a $p \\times p$ orthogonal matrix with columns spanning the row space of $\\mathbf{X}$;\n", "- $\\mathbf{D}$ a $p \\times p$ diagonal matrix with diagonal entries $d_1 \\ge d_2 \\ge \\cdots d_p \\ge 0$ called the singular values of $\\mathbf{X}$." ] }, { "cell_type": "markdown", "id": "60257172", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "> ***Question (optional)***\n", "> - Show that the OLS estimates are such that $\\mathbf{X} \\hat{\\boldsymbol{\\beta}} = \\mathbf{U} \\mathbf{U}^\\top \\mathbf{y}$.\n", "> - Show that the ridge estimates are such that $\\mathbf{X} \\hat{\\boldsymbol{\\beta}}^\\mathrm{ridge} = \\sum_{j = 1}^p \\mathbf{u}_j \\frac{d_j^2}{d_j^2 + \\lambda} \\mathbf{u}_j^\\top \\mathbf{y}$." ] }, { "cell_type": "markdown", "id": "e83baa27", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "> ***Question***\n", "> - Interpret how the ridge shrinks the coordinates of $\\mathbf{y}$ with respect to the orthonormal basis $\\mathbf{U}$.\n", "> - What is the role of $\\mathrm{df}(\\lambda) := \\mathrm{tr}\\left[\\mathbf{X}(\\mathbf{X}^\\top \\mathbf{X} + \\lambda \\mathbf{I})^{-1} \\mathbf{X}^\\top\\right] = \\sum_{j = 1}^p \\frac{d_j^2}{d_j^2 + \\lambda}$." ] }, { "cell_type": "markdown", "id": "f8f412aa", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Regularization on a Simple Example\n", "\n", "\n", "\n", "- Small training set\n", "- Fit a linear model without regularization" ] }, { "cell_type": "markdown", "id": "55cba17d", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Regularization on a Simple Example\n", "\n", "\n", "\n", "- Small training set\n", "- Fit a linear model without regularization\n", "- Training points sampled at random\n", "- Can overfit if the data is noisy!" 
] }, { "cell_type": "markdown", "id": "de6d7f9c", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Regularization on a Simple Example: Random Training Sets\n", "\n", "| | | |\n", "|---|---|---|\n", "| | | |" ] }, { "cell_type": "markdown", "id": "96135a45", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Bias-Variance Tradeoff in Ridge Regression\n", "\n", "
\n", " \n", "\n", " \n", "
\n", "\n", "**Linear** regression (OLS)\n", "
\n", "High variance, no bias\n", "
\n", "
\n", "
\n", " \n", "\n", " \n", "
\n", "\n", "**Ridge** regression\n", "
\n", "Low variance, but biased!\n", "
\n", "
\n", "\n", "
" ] }, { "cell_type": "markdown", "id": "7736c1cb", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Bias-Variance Tradeoff in Ridge Regression\n", "\n", "| | | |\n", "|---|---|---|\n", "| Too much variance | Best tradeoff | Too much bias |" ] }, { "cell_type": "markdown", "id": "babfae57", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Partial Summary\n", "\n", "- Can overfit when:\n", " - $N$ is too small and $p$ is large\n", " - In particular with non-informative features\n", "- Regularization for regression:\n", " - From linear regression to ridge regression $\\rightarrow$ less overfit\n", " - large regularization parameter $\\rightarrow$ strong regularization $\\rightarrow$ smaller coefficients" ] }, { "cell_type": "markdown", "id": "e76f6dbd", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Tuning Hyperparameters (model selection)\n", "\n", "- Some **hyperparameters** need to be estimated in addition to coefficients (regularization parameter, nonlinear coefficients in linear models, etc.)\n", "- Validation data or Cross-Validation (CV) may be used to **select the best hyperparameters** by minimizing the estimated validation error (grid search, random search, etc.)\n", "\n", "" ] }, { "cell_type": "markdown", "id": "e6a24cd7", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Testing after validation\n", "\n", "\n", "The best validation error is for hyperparameters that were **trained using the validation data**: it is **not the EPE** for this choice of hyperparameters!\n", "\n", "$\\rightarrow$ Distinguish between **train**, **validation** and **test data**\n", "\n", "Training and validation can be combined using CV but **test data should be kept to evaluate the PE conditionned on the CV set** for the final choice of coefficients and hyperparameters.\n", "\n", "To maximize the use of the data and estimate the EPE, **nested CV** can be used." ] }, { "cell_type": "markdown", "id": "190839ab", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Law of Large Numbers?\n", "\n", "Estimates attempt to minimize a function of the training error $\\overline{\\mathrm{err}}$.\n", "\n", "For estimates to converge with the sample size, so should $\\overline{\\mathrm{err}}$." ] }, { "cell_type": "markdown", "id": "aaaae356", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "$\\rightarrow$ We need some **Law of Large Numbers** to be applicable.\n", "\n", "Basic assumptions: **independent** and **identically distributed**." ] }, { "cell_type": "markdown", "id": "354d07c8", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### What could go wrong?\n", "\n", "In the natural and engineering sciences many problems depend on **time**.\n", "\n", "So far, we have assumed that the joint distribution $f_{\\boldsymbol{X}, Y}$ is **independent of time**.\n", "\n", "In particular, we have assumed that the joint process is **statistically stationary**." 
] }, { "cell_type": "markdown", "id": "36e33e9e", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "Variations in time can rarely be considered purely random:\n", "\n", "$\\rightarrow$ some **dependence** persist between realizations" ] }, { "cell_type": "markdown", "id": "00f1bd2c", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Yet, we are fine if we can show that:\n", "- there is a **stationary distribution**\n", "- realizations sufficiently distant in time **no longer correlate**" ] }, { "cell_type": "markdown", "id": "e08e20cf", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "However, distributions may change with **cycles** and **trends**." ] }, { "cell_type": "markdown", "id": "c2944fc2", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Violation of Statistical Stationarity\n", "\n", "
\n", " \n", "\n", "\n", "[By RCraig09 - Own work, CC BY-SA 4.0](https://commons.wikimedia.org/w/index.php?curid=88535596)\n", "
\n", "\n", "Surface air temperature variability can be decomposed into:\n", "\n", "$-$ (pseudo-)periodic **cycles** (diurnal, annual, Milankovitch)\n", "\n", "$-$ a **continuous spectrum** of frequencies due to chaotic dynamics\n", "\n", "$-$ an increasing **trend** due to global warming\n", "\n", "$-$ other non-equilibrium variations (effect of volcanoes, solar activity, ...)" ] }, { "cell_type": "markdown", "id": "4e1afd14", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "### Partial Summary\n", "\n", "- In statistics, we **always** make assumptions about the probability distribution of the data (stationarity, independence, parametric form, etc.)\n", "- The quality of statistics (predictions, error estimates, etc.) depends on the validity of these assumptions\n", "- Even the best validation could be wrong about predictions if the new data does not satisfy these assumptions!" ] }, { "cell_type": "markdown", "id": "33262c38", "metadata": { "slideshow": { "slide_type": "subslide" } }, "source": [ "## To go further\n", "\n", "- [Linear model inspection in Scikit-learn course](https://inria.github.io/scikit-learn-mooc/python_scripts/dev_features_importance.html#linear-model-inspection);\n", "- [Introduction of the evaluation metrics in Scikit-learn course](https://inria.github.io/scikit-learn-mooc/evaluation/02_metrics.html#introduction-of-the-evaluation-metrics);\n", "- *Generalized CV* approximation of leave-one out CV (Chap. 7.10 in Hastie *et al.* 2009);\n", "- The *bootstrap* as a way of assessing the accuracy of a parameter estimate or a prediction (shuffled version of cross-validation, Chap. 7.11 and 8 in Hastie *et al.* 2009).\n", "- Avoiding validation thanks to Bayesian approach." ] }, { "cell_type": "markdown", "id": "64018dbd", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## References\n", "\n", "- [James, G., Witten, D., Hastie, T., Tibshirani, R., n.d. *An Introduction to Statistical Learning*, 2st ed. Springer, New York, NY.](https://www.statlearning.com/)\n", "- Chap. 2, 3 and 7 in [Hastie, T., Tibshirani, R., Friedman, J., 2009. *The Elements of Statistical Learning*, 2nd ed. Springer, New York.](https://doi.org/10.1007/978-0-387-84858-7)\n", "- Chap. 5 and 7 in [Wilks, D.S., 2019. *Statistical Methods in the Atmospheric Sciences*, 4th ed. Elsevier, Amsterdam.](https://doi.org/10.1016/C2017-0-03921-6)" ] }, { "cell_type": "markdown", "id": "126964e5", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "***\n", "## Credit\n", "\n", "[//]: # \"This notebook is part of [E4C Interdisciplinary Center - Education](https://gitlab.in2p3.fr/energy4climate/public/education).\"\n", "Contributors include Bruno Deremble and Alexis Tantet.\n", "Several slides and images are taken from the very good [Scikit-learn course](https://inria.github.io/scikit-learn-mooc/).\n", "\n", "
\n", "\n", "
\n", " \n", "\"Logo\n", "\n", "\"Logo\n", "\n", "\"Logo\n", "\n", "\"Logo\n", "\n", "\"Logo\n", "\n", "\"Logo\n", "\n", "\"Logo\n", " \n", "
\n", "\n", "
\n", "\n", "
\n", " \"Creative\n", "
This work is licensed under a [Creative Commons Attribution-ShareAlike 4.0 International License](https://creativecommons.org/licenses/by-sa/4.0/).
" ] } ], "metadata": { "celltoolbar": "Slideshow", "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.12" }, "latex_envs": { "LaTeX_envs_menu_present": true, "autoclose": true, "autocomplete": false, "bibliofile": "biblio.bib", "cite_by": "apalike", "current_citInitial": 1, "eqLabelWithNumbers": true, "eqNumInitial": 1, "hotkeys": { "equation": "Ctrl-E", "itemize": "Ctrl-I" }, "labels_anchors": false, "latex_user_defs": false, "report_style_numbering": false, "user_envs_cfg": false }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": {}, "toc_section_display": true, "toc_window_display": false }, "varInspector": { "cols": { "lenName": 16, "lenType": 16, "lenVar": 40 }, "kernels_config": { "python": { "delete_cmd_postfix": "", "delete_cmd_prefix": "del ", "library": "var_list.py", "varRefreshCmd": "print(var_dic_list())" }, "r": { "delete_cmd_postfix": ") ", "delete_cmd_prefix": "rm(", "library": "var_list.r", "varRefreshCmd": "cat(var_dic_list()) " } }, "types_to_exclude": [ "module", "function", "builtin_function_or_method", "instance", "_Feature" ], "window_display": false } }, "nbformat": 4, "nbformat_minor": 5 }