Monday, September 18, 2017

Week 2

Multiple Features

  • Linear regression with multiple variables is called multivariate linear regression.
  • Notation is as follows:
\begin{align*}x_j^{(i)} &= \text{value of feature } j \text{ in the }i^{th}\text{ training example} \newline x^{(i)}& = \text{the input (features) of the }i^{th}\text{ training example} \newline m &= \text{the number of training examples} \newline n &= \text{the number of features} \end{align*}
  • The hypothesis function takes the form:
\begin{align*}h_\theta (x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \theta_3 x_3 + \cdots + \theta_n x_n\end{align*}
  • It can also be expressed in matrix form:
\begin{align*}h_\theta(x) =\begin{bmatrix}\theta_0 \hspace{2em} \theta_1 \hspace{2em} ... \hspace{2em} \theta_n\end{bmatrix}\begin{bmatrix}x_0 \newline x_1 \newline \vdots \newline x_n\end{bmatrix}= \theta^T x\end{align*}
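A minimal sketch of this matrix form in Python (all names and values here are assumed for illustration, not from the course):

import numpy as np

# Hypothetical parameters and one training example (n = 2 features)
theta = np.array([50.0, 0.1, 20.0])   # [theta_0, theta_1, theta_2]
x = np.array([1.0, 2104.0, 3.0])      # x_0 = 1 prepended as the bias term

h = theta @ x                         # h_theta(x) = theta^T x
print(h)                              # 50 + 0.1*2104 + 20*3 = 320.4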


Gradient Descent For Multiple Variables

  • The algorithm is as follows (a vectorized sketch in Python follows the update rules):

\begin{align*} & \text{repeat until convergence:} \; \lbrace \newline \; & \theta_0 := \theta_0 - \alpha \frac{1}{m} \sum\limits_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)}) \cdot x_0^{(i)}\newline \; & \theta_1 := \theta_1 - \alpha \frac{1}{m} \sum\limits_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)}) \cdot x_1^{(i)} \newline \; & \theta_2 := \theta_2 - \alpha \frac{1}{m} \sum\limits_{i=1}^{m} (h_\theta(x^{(i)}) - y^{(i)}) \cdot x_2^{(i)} \newline & \cdots \newline \rbrace \end{align*}
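A vectorized Python sketch of the update rules above (function and variable names are assumptions); X is the m × (n+1) design matrix with the x_0 = 1 column already prepended:

import numpy as np

def gradient_descent(X, y, alpha=0.01, num_iters=1000):
    """theta := theta - (alpha/m) * X^T (X theta - y), updating all theta_j simultaneously."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(num_iters):
        error = X @ theta - y               # h_theta(x^(i)) - y^(i) for every example i
        theta -= (alpha / m) * (X.T @ error)
    return theta

The product X^T (X theta - y) computes every per-feature sum in the update rule at once, which is why all theta_j can be updated simultaneously.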


  • Feature Scaling
    • When features have very different ranges, gradient descent converges slowly.
    • Ideal range for each input variable:
      • −1 ≤ x_i ≤ 1
      • This is only a rough guideline; ranges far larger or far smaller are both undesirable.
    • Mean normalization (see the sketch after this list):
      • μ_i is the average value of feature i
      • s_i is the range of values (max − min), or the standard deviation
      • \begin{align*}x_i := \dfrac{x_i - \mu_i}{s_i}\end{align*}
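A minimal sketch of mean normalization (names are assumptions; if a bias column x_0 = 1 has already been added, it should be excluded from scaling):

import numpy as np

def mean_normalize(X):
    """x_i := (x_i - mu_i) / s_i for every feature column."""
    mu = X.mean(axis=0)                   # mu_i: average value of feature i
    s = X.max(axis=0) - X.min(axis=0)     # s_i: range (max - min); X.std(axis=0) also works
    return (X - mu) / s, mu, s

X = np.array([[2104.0, 3.0], [1416.0, 2.0], [852.0, 1.0]])
X_norm, mu, s = mean_normalize(X)

Note that mu and s must be kept, so that inputs at prediction time can be scaled with the same parameters.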

Normal Equation

  • To minimize the cost function J, take the partial derivative with respect to each θ_i, set it to zero, and solve for θ; the optimal solution is the following (a numeric sketch is given at the end of this section):
\begin{align*}\theta = (X^T X)^{-1}X^T y\end{align*}


  • An example of X and y:
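(The values below are assumed for illustration, in the spirit of the course's housing example; the first column of X is the bias feature x_0 = 1.)

\begin{align*}X = \begin{bmatrix}1 & 2104 \newline 1 & 1416 \newline 1 & 1534 \newline 1 & 852\end{bmatrix}, \quad y = \begin{bmatrix}460 \newline 232 \newline 315 \newline 178\end{bmatrix}\end{align*}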


  • With the normal equation, feature scaling is not necessary.
  • A comparison of gradient descent and the normal equation:

Gradient Descent             | Normal Equation
Need to choose alpha         | No need to choose alpha
Needs many iterations        | No need to iterate
O(kn²)                       | O(n³), needs to compute the inverse of X^T X
Works well when n is large   | Slow if n is very large

  • In practice, once n exceeds about 10,000, gradient descent is preferred over the normal equation.
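A minimal numeric sketch of the closed-form solution, reusing the illustrative X and y from above; numpy.linalg.pinv is used rather than a plain inverse so it also tolerates the non-invertible case discussed next:

import numpy as np

X = np.array([[1, 2104], [1, 1416], [1, 1534], [1, 852]], dtype=float)
y = np.array([460.0, 232.0, 315.0, 178.0])

# theta = (X^T X)^(-1) X^T y, with no feature scaling required
theta = np.linalg.pinv(X.T @ X) @ X.T @ y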

Normal Equation Noninvertibility

  • If X^T X is non-invertible, the likely causes are:
    • Redundant features, i.e. linearly dependent ones (e.g., the same length expressed in ft and in m; see the sketch below)
    • Too many features (m ≤ n)
      • Fix: delete some features, or use regularization
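A brief sketch (assumed data) showing that a linearly dependent feature makes X^T X singular, and that the pseudo-inverse still returns a usable θ:

import numpy as np

size_ft = np.array([2104.0, 1416.0, 1534.0, 852.0])
# Two linearly dependent features: the same size in ft and in m (1 ft = 0.3048 m)
X = np.column_stack([np.ones(4), size_ft, size_ft * 0.3048])
y = np.array([460.0, 232.0, 315.0, 178.0])

A = X.T @ X
print(np.linalg.matrix_rank(A))        # 2 < 3: A is singular (non-invertible)
theta = np.linalg.pinv(A) @ X.T @ y    # pinv still yields a minimum-norm solution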






