安東尼的筆記屋 (Anthony's Notebook): Week4

Wednesday, September 20, 2017

Week4

Neural Networks

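In the model, the inputs are the features $x_1\cdots x_n$ and the output is the result of the hypothesis function.
The input node $x_0$ is sometimes called the bias unit; its value is always 1.
A neural network uses the same logistic function as in classification: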

$\frac{1}{1 + e^{-\theta^Tx}}$
This is sometimes called the sigmoid (logistic) activation function.
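As a minimal sketch of the formula above, the sigmoid and the hypothesis can be written in plain Python. The function names `sigmoid` and `hypothesis` are illustrative choices, not names from the notes, and `x` is assumed to already contain the bias value $x_0 = 1$.

```python
import math

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def hypothesis(theta, x):
    """h_theta(x) = g(theta^T x); x must already include the bias x_0 = 1."""
    z = sum(t * xi for t, xi in zip(theta, x))
    return sigmoid(z)

print(sigmoid(0.0))  # 0.5
```

For very large negative `z`, `math.exp(-z)` can overflow; a production version would guard against that, which this sketch does not.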

The "theta" parameters are sometimes also called "weights".
The simplest representation looks like this:

$\begin{bmatrix}x_0 \newline x_1 \newline x_2 \newline \end{bmatrix}\rightarrow\begin{bmatrix}\ \ \ \newline \end{bmatrix}\rightarrow h_\theta(x)$
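The input nodes sit in the input layer (layer 1), pass through another node (layer 2), and finally produce the hypothesis function in the output layer.
There may be more than one layer between the input and output layers; these are called hidden layers.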

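We call the nodes of a hidden layer activation units, written $a_i^{(j)}$ for unit $i$ in layer $j$.
With a single hidden layer, the value of each activation node is the sigmoid $g$ applied to a weighted sum of the input nodes, bias unit included:

$a_i^{(2)} = g(\Theta_{i0}^{(1)}x_0 + \Theta_{i1}^{(1)}x_1 + \cdots + \Theta_{in}^{(1)}x_n)$
Each layer has its own weight matrix $\Theta^{(j)}$, whose dimensions are determined as follows: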

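If layer $j$ has $s_j$ units and layer $j+1$ has $s_{j+1}$ units, then $\Theta^{(j)}$ has dimension $s_{j+1} \times (s_j + 1)$.
The +1 accounts for the bias node $x_0$ and its weight $\Theta_0^{(j)}$.
The outputs of a layer do not include a bias node; only its inputs do.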
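The dimension rule can be checked with a short sketch. The helper name `theta_shapes` and the example layer sizes are hypothetical; the list passed in holds the unit counts $s_1, s_2, \ldots$ without bias units.

```python
def theta_shapes(layer_sizes):
    """Return the (rows, cols) shape of each Theta^(j).

    layer_sizes[j-1] is s_j, the number of units in layer j, bias excluded.
    Theta^(j) maps layer j (plus its bias) to layer j+1.
    """
    return [(s_next, s_cur + 1)
            for s_cur, s_next in zip(layer_sizes, layer_sizes[1:])]

# 2 inputs, 5 hidden units, 1 output
print(theta_shapes([2, 5, 1]))  # [(5, 3), (1, 6)]
```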

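Vectorized implementation:

The activation nodes of a layer are written as a vector: $a^{(j)} = g(z^{(j)})$
where $z^{(j)} = \Theta^{(j-1)}a^{(j-1)}$
To compute the next layer as well, add a bias unit $a_0^{(j)} = 1$ to $a^{(j)}$,
then compute $z^{(j+1)} = \Theta^{(j)}a^{(j)}$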
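The vectorized steps can be sketched as a loop over the weight matrices. This is an illustrative implementation using plain Python lists rather than a matrix library; the name `forward` and the representation of each $\Theta^{(j)}$ as a list of rows are assumptions made for the example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(thetas, x):
    """Forward propagation.

    thetas: list of weight matrices, each a list of rows; row i of
            Theta^(j) holds the weights feeding unit i of layer j+1,
            with the bias weight in column 0.
    x:      input features WITHOUT the bias unit.
    Returns the activations of the output layer.
    """
    a = list(x)
    for theta in thetas:
        a_with_bias = [1.0] + a  # add the bias unit a_0 = 1
        z = [sum(w * v for w, v in zip(row, a_with_bias)) for row in theta]
        a = [sigmoid(zi) for zi in z]  # a^(j+1) = g(z^(j+1))
    return a

# All-zero weights give z = 0 everywhere, so the output is g(0) = 0.5.
zero_net = [[[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]], [[0.0, 0.0, 0.0]]]
print(forward(zero_net, [3.0, 4.0]))  # [0.5]
```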

Examples and Intuitions I

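A simple example is using a neural network to predict the result of $x_1$ AND $x_2$. The network looks like this:

$\begin{bmatrix}x_0 \newline x_1 \newline x_2\end{bmatrix}\rightarrow\begin{bmatrix}g(z^{(2)})\end{bmatrix}\rightarrow h_\Theta(x)$
$x_0$ is the bias node, whose value is 1.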

The Theta matrix is then: $\Theta^{(1)} =\begin{bmatrix}-30 & 20 & 20\end{bmatrix}$
The sigmoid function approaches 1 once $z > 4$ and approaches 0 once $z < -4$.

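With $h_\Theta(x) = g(-30 + 20x_1 + 20x_2)$, the results work out as follows:

$x_1 = 0$ and $x_2 = 0$: $g(-30) \approx 0$
$x_1 = 0$ and $x_2 = 1$: $g(-10) \approx 0$
$x_1 = 1$ and $x_2 = 0$: $g(-10) \approx 0$
$x_1 = 1$ and $x_2 = 1$: $g(10) \approx 1$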
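The AND example can be reproduced numerically with the weights $\begin{bmatrix}-30 & 20 & 20\end{bmatrix}$ given above. The names `THETA_AND` and `predict_and` are illustrative, and rounding the sigmoid output to 0 or 1 is a convenience for display.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Weights from the notes: [bias weight, weight for x1, weight for x2]
THETA_AND = [-30.0, 20.0, 20.0]

def predict_and(x1, x2):
    """h(x) = g(-30 + 20*x1 + 20*x2), with the bias x0 = 1."""
    z = THETA_AND[0] * 1.0 + THETA_AND[1] * x1 + THETA_AND[2] * x2
    return sigmoid(z)

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, round(predict_and(x1, x2)))
# 0 0 0
# 0 1 0
# 1 0 0
# 1 1 1
```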

Examples and Intuitions II

In the previous section, AND/OR/NOR could all be computed without a hidden layer.
XNOR, however, needs one extra hidden layer, which combines AND/OR/NOR as intermediate steps.
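One way to build XNOR from these pieces is sketched below: the hidden layer computes AND and NOR, and the output layer ORs them together. Only the AND weights appear in these notes; the NOR weights `[10, -20, -20]` and OR weights `[-10, 20, 20]` are assumed values chosen to behave like those gates, and other choices would work equally well.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(theta, a):
    """Compute one layer: prepend the bias unit, then apply g row by row."""
    a_with_bias = [1.0] + list(a)
    return [sigmoid(sum(w * v for w, v in zip(row, a_with_bias)))
            for row in theta]

THETA1 = [[-30.0, 20.0, 20.0],   # hidden unit 1: x1 AND x2 (from the notes)
          [10.0, -20.0, -20.0]]  # hidden unit 2: x1 NOR x2 (assumed weights)
THETA2 = [[-10.0, 20.0, 20.0]]   # output: a1 OR a2 (assumed weights)

def xnor(x1, x2):
    hidden = layer(THETA1, [x1, x2])
    return layer(THETA2, hidden)[0]

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, round(xnor(x1, x2)))
# 0 0 1
# 0 1 0
# 1 0 0
# 1 1 1
```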

Multiclass Classification

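Suppose the final result falls into 4 classes instead of just 2. Each class can then be represented by a vector of size 4:

$\begin{bmatrix}1 \newline 0 \newline 0 \newline 0\end{bmatrix}$, $\begin{bmatrix}0 \newline 1 \newline 0 \newline 0\end{bmatrix}$, $\begin{bmatrix}0 \newline 0 \newline 1 \newline 0\end{bmatrix}$, $\begin{bmatrix}0 \newline 0 \newline 0 \newline 1\end{bmatrix}$
For a given input, $h_\Theta(x)$ should come out as (approximately) one of these four possible vectors.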
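A small sketch of this encoding follows. The helper names `one_hot` and `predict_class` are hypothetical, classes are numbered from 0 here, and picking the largest output as the predicted class is a common convention rather than something stated in these notes.

```python
def one_hot(k, num_classes=4):
    """Target vector for class k: 1 at position k, 0 elsewhere."""
    return [1 if i == k else 0 for i in range(num_classes)]

def predict_class(h):
    """Index of the largest output of h_Theta(x)."""
    return max(range(len(h)), key=lambda i: h[i])

print(one_hot(2))                              # [0, 0, 1, 0]
print(predict_class([0.1, 0.7, 0.15, 0.05]))   # 1
```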
