Model Representation
Notation
- a_i^{(j)} = “activation” of unit i in layer j
- Θ^{(j)} = matrix of weights controlling the function mapping from layer j to layer j + 1
The “activation function” here is the sigmoid (logistic) function, g(z) = 1 / (1 + e^{-z}).
If the network has s_j units in layer j and s_{j+1} units in layer j + 1, then Θ^{(j)} will be of dimension s_{j+1} × (s_j + 1).
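The notation above can be sketched as a single forward-propagation step in NumPy. The layer sizes and random weights below are illustrative, not from the notes; the point is only the dimension check: Θ^{(1)} is s_2 × (s_1 + 1) because of the bias unit.

```python
import numpy as np

def sigmoid(z):
    # g(z) = 1 / (1 + e^{-z}), the activation function above
    return 1.0 / (1.0 + np.exp(-z))

def forward_layer(a_prev, theta):
    # Prepend the bias unit a_0 = 1, then apply Theta and the sigmoid.
    a_prev = np.concatenate(([1.0], a_prev))
    return sigmoid(theta @ a_prev)

# Illustrative sizes: layer 1 has s_1 = 3 units, layer 2 has s_2 = 4 units,
# so Theta^{(1)} has dimension s_2 x (s_1 + 1) = 4 x 4.
rng = np.random.default_rng(0)
theta1 = rng.standard_normal((4, 4))
x = np.array([0.5, -1.2, 2.0])
a2 = forward_layer(x, theta1)
print(a2.shape)  # (4,)
```

Each output activation lies in (0, 1) because the sigmoid squashes its input.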
Quiz
- To represent the XOR function, we must compose multiple logical operations, using a hidden layer.
- Since we can build the basic AND, OR, and NOT functions with a two-layer network, we can (approximately) represent any logical function by composing these basic functions over multiple layers.
- A smaller value of λ allows the model to more closely fit the training data, thereby increasing the chances of overfitting.
- A larger value of λ will shrink the magnitude of the parameters Θ, thereby reducing the chance of overfitting the data.
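The XOR point above can be made concrete with sigmoid units. The weight vectors below are the standard textbook choice for AND/OR-style units (one of many that work, not taken from these notes): XOR(x1, x2) = (x1 OR x2) AND (x1 NAND x2), so one hidden layer suffices.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def unit(theta, x1, x2):
    # One sigmoid unit: g(theta_0 + theta_1*x1 + theta_2*x2)
    return sigmoid(theta[0] + theta[1] * x1 + theta[2] * x2)

# Large weights make the sigmoid saturate near 0 or 1.
AND  = np.array([-30.0,  20.0,  20.0])
OR   = np.array([-10.0,  20.0,  20.0])
NAND = np.array([ 30.0, -20.0, -20.0])

def xor(x1, x2):
    # Hidden layer computes OR and NAND; the output unit ANDs them.
    a1 = unit(OR, x1, x2)
    a2 = unit(NAND, x1, x2)
    return unit(AND, a1, a2)

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, round(xor(x1, x2)))  # prints the XOR truth table
```

The same composition idea extends to any logical function: build AND/OR/NOT units, then stack them over layers.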
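The effect of λ on parameter magnitude can be seen directly in regularized least squares, used here as a stand-in (it has a closed form; this is not the neural-network cost itself, and the data is synthetic): larger λ shrinks ‖Θ‖.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 0.1 * rng.standard_normal(50)

def ridge(X, y, lam):
    # Closed-form regularized least squares: (X^T X + lam*I)^{-1} X^T y
    n = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

# The norm of the fitted parameters shrinks as lambda grows.
for lam in (0.0, 1.0, 100.0):
    theta = ridge(X, y, lam)
    print(lam, np.linalg.norm(theta))
```

With λ = 0 the model fits the training data as closely as possible (risking overfitting); cranking λ up trades training fit for smaller parameters.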