Alessandro D. Gagliardi
From a Taxonomy of Data Science (by Dataists)
"[Every] non-key [attribute] must provide a fact about the key, the whole key, and nothing but the key (so help me Codd)."
| Resource | GET | PUT | POST | DELETE |
|---|---|---|---|---|
| Collection URI, such as http://example.com/resources | List the URIs and perhaps other details of the collection's members. | Replace the entire collection with another collection. | Create a new entry in the collection. The new entry's URI is assigned automatically and is usually returned by the operation. | Delete the entire collection. |
| Element URI, such as http://example.com/resources/item17 | Retrieve a representation of the addressed member of the collection, expressed in an appropriate Internet media type. | Replace the addressed member of the collection, or if it doesn't exist, create it. | Not generally used. Treat the addressed member as a collection in its own right and create a new entry in it. | Delete the addressed member of the collection. |
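For example, the table above maps onto HTTP calls like the following (a rough sketch using Python's `requests` library; the example.com URIs are the placeholders from the table and the JSON payloads are made up):

```python
import requests

COLLECTION = "http://example.com/resources"          # collection URI from the table

# GET on the collection URI: list the members
members = requests.get(COLLECTION)
print(members.status_code, members.json())

# POST to the collection URI: create a new entry; the server assigns its URI,
# typically returned in the Location header
created = requests.post(COLLECTION, json={"name": "widget"})
print(created.headers.get("Location"))

# PUT on an element URI: replace the addressed member (or create it)
requests.put(COLLECTION + "/item17", json={"name": "widget", "qty": 3})

# DELETE on an element URI: remove the addressed member
requests.delete(COLLECTION + "/item17")
```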
A functional relationship between input & response variables
A simple linear regression model captures a linear relationship between an input x and response variable y
$$ y = \alpha + \beta x + \epsilon $$
$y =$ response variable (the one we want to predict)
$x =$ input variable (the one we use to train the model)
$\alpha =$ intercept (where the line crosses the y-axis)
$\beta =$ regression coefficient (the model “parameter”)
$\epsilon =$ residual (the prediction error)
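A minimal sketch of fitting such a model in Python (made-up data; `scipy.stats.linregress` does the ordinary least squares fit):

```python
import numpy as np
from scipy import stats

# toy data: y is roughly alpha + beta * x plus noise
rng = np.random.RandomState(0)
x = np.linspace(0, 10, 50)
y = 2.0 + 0.5 * x + rng.normal(scale=1.0, size=x.shape)

# ordinary least squares fit of y = alpha + beta * x
beta, alpha, r_value, p_value, std_err = stats.linregress(x, y)
print("alpha (intercept):", alpha)
print("beta (coefficient):", beta)

# epsilon: the residuals left over after the fit
epsilon = y - (alpha + beta * x)
print("residual standard deviation:", epsilon.std())
```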
N.B. Statistics is incapable of proving that anything is true. It can only suggest that something probably isn't.
from Wikipedia:
"Machine learning, a branch of artificial intelligence, is about the construction and study of systems that can learn from data."
"The core of machine learning deals with representation and generalization..."

(too much to even summarize but...)

$P(AB) = P(A|B) \times P(B)\qquad$ by the definition of conditional probability
$P(BA) = P(B|A) \times P(A)\qquad$ likewise, swapping $A$ and $B$
But $P(AB) = P(BA)\qquad$ since event $AB =$ event $BA$
$\hookrightarrow P(A|B) \times P(B) = P(B|A) \times P(A)\>$ by combining the above
$\hookrightarrow P(A|B) = \frac{P(B|A) \times P(A)}{P(B)}\>$ by rearranging last step
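A quick numeric check (made-up numbers, not from the slides): suppose 1% of email is spam, so $P(A) = 0.01$; 90% of spam contains the word "free", so $P(B|A) = 0.9$; and 10% of all email contains it, so $P(B) = 0.1$. Then

$$ P(A|B) = \frac{P(B|A) \times P(A)}{P(B)} = \frac{0.9 \times 0.01}{0.1} = 0.09 $$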
$$ P(C | x_1, \ldots, x_n) = \frac{P(x_1, \ldots, x_n | C) \times P(C)}{P(x_1, \ldots, x_n)} $$
In plain English, the above equation can be written as:
$$ \mbox{posterior} = \frac{\mbox{likelihood} \times \mbox{prior}}{\mbox{evidence}} $$
Make a simplifying assumption. In particular, assume that the features $x_1, \ldots, x_n$ are conditionally independent of each other given the class $C$:
$$ P(x_1, \ldots, x_n | C) \approx P(x_1 | C) \times P(x_2 | C) \times \ldots \times P(x_n|C) $$
This "naïve" assumption simplifies the likelihood function to make it tractable.
Recall our earlier discussion of overfitting.
It is a result of matching the training set too closely.
In other words, an overfit model matches the noise in the dataset instead of the signal.
source: Data Analysis with Open Source Tools, by Philipp K. Janert. O’Reilly Media, 2011
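A small illustration (made-up data, not the figure from the source above): as the polynomial degree grows, training error keeps falling while error on held-out points typically rises.

```python
import numpy as np

rng = np.random.RandomState(1)
x_train = np.sort(rng.uniform(0, 1, 15))
y_train = np.sin(2 * np.pi * x_train) + rng.normal(scale=0.2, size=15)
x_test = np.linspace(0, 1, 100)
y_test = np.sin(2 * np.pi * x_test)

for degree in (1, 3, 10):
    coeffs = np.polyfit(x_train, y_train, degree)             # fit a degree-d polynomial
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    print(degree, round(train_mse, 4), round(test_mse, 4))
# the high-degree fit chases the noise: tiny training error, large held-out error
```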
Q: How do we define the complexity of a regression model?
A: One method is to define complexity as a function of the size of the coefficients.
Ex 1: $\sum |\beta_i| \leftarrow$ this is called the L1-norm
Ex 2: $\sum \beta_i^2 \leftarrow$ this is called the L2-norm
These measures of complexity lead to the following regularization techniques:
L1 regularization: $y = \sum \beta_i x_i + \epsilon$ such that $\sum |\beta_i| < s $
L2 regularization: $y = \sum \beta_i x_i + \epsilon$ such that $\sum \beta_i^2 < s $
Regularization refers to the method of preventing overfitting by explicitly controlling model complexity.
These regularization problems can also be expressed as:
L1 regularization: $ \min(||y - x\beta||^2 + \lambda ||\beta||_1) $
L2 regularization: $ \min(||y - x\beta||^2 + \lambda ||\beta||_2^2) $
This (Lagrangian) formulation reflects the fact that there is a cost associated with regularization.
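A minimal sketch with scikit-learn (`Lasso` for the L1 penalty, `Ridge` for the L2 penalty; the `alpha` argument plays the role of $\lambda$, and the data is made up):

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

# toy data: 10 inputs, only the first 3 actually affect y
rng = np.random.RandomState(0)
X = rng.normal(size=(100, 10))
true_beta = np.array([3.0, -2.0, 1.5] + [0.0] * 7)
y = X.dot(true_beta) + rng.normal(scale=0.5, size=100)

lasso = Lasso(alpha=0.1).fit(X, y)    # L1 penalty: pushes small coefficients to exactly zero
ridge = Ridge(alpha=0.1).fit(X, y)    # L2 penalty: shrinks all coefficients toward zero

print("L1 (lasso) coefficients:", np.round(lasso.coef_, 2))
print("L2 (ridge) coefficients:", np.round(ridge.coef_, 2))
```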
In the DAT6 folder, from the command line:
git commit -am ...
git checkout gh-pages
git pull
git checkout personal
git merge gh-pages
ipython notebook
Then open DS_Lab10-Regularization