Forward propagation (3-layer network)
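
Below is a minimal NumPy sketch of a 3-layer forward pass, assuming sigmoid hidden activations and an identity output layer; the layer sizes (2-3-2-2) and random weights are illustrative, not the repo's actual values.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def init_network():
    # Layer sizes 2 -> 3 -> 2 -> 2; random weights for illustration only
    return {
        'W1': np.random.randn(2, 3), 'b1': np.zeros(3),
        'W2': np.random.randn(3, 2), 'b2': np.zeros(2),
        'W3': np.random.randn(2, 2), 'b3': np.zeros(2),
    }

def forward(net, x):
    z1 = sigmoid(x @ net['W1'] + net['b1'])
    z2 = sigmoid(z1 @ net['W2'] + net['b2'])
    return z2 @ net['W3'] + net['b3']   # identity output layer

y = forward(init_network(), np.array([1.0, 0.5]))
```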

Loss function
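
A sketch of the cross-entropy error for a mini-batch, assuming one-hot targets; the 1e-7 term is the usual guard against log(0).

```python
import numpy as np

def cross_entropy_error(y, t):
    # y: predicted probabilities, t: one-hot targets, both (batch, classes)
    if y.ndim == 1:
        y = y.reshape(1, -1)
        t = t.reshape(1, -1)
    batch_size = y.shape[0]
    return -np.sum(t * np.log(y + 1e-7)) / batch_size
```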

Gradient
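
A central-difference numerical gradient, the standard way to compute gradients before backpropagation is introduced; each element of x is perturbed by h = 1e-4.

```python
import numpy as np

def numerical_gradient(f, x):
    # Central difference: (f(x+h) - f(x-h)) / (2h) for each element of x
    h = 1e-4
    grad = np.zeros_like(x)
    it = np.nditer(x, flags=['multi_index'])
    while not it.finished:
        idx = it.multi_index
        tmp = x[idx]
        x[idx] = tmp + h
        fxh1 = f(x)
        x[idx] = tmp - h
        fxh2 = f(x)
        grad[idx] = (fxh1 - fxh2) / (2 * h)
        x[idx] = tmp          # restore the original value
        it.iternext()
    return grad
```

Typical usage is `numerical_gradient(lambda W: net.loss(x, t), net.params['W1'])`; it is slow but useful for gradient-checking the backprop implementation.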

Classical 2-layer network
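
A compact 2-layer network sketch, assuming a sigmoid hidden layer and a softmax output with cross-entropy loss; the init scale std=0.01 is illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)   # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

class TwoLayerNet:
    def __init__(self, input_size, hidden_size, output_size, std=0.01):
        self.params = {
            'W1': std * np.random.randn(input_size, hidden_size),
            'b1': np.zeros(hidden_size),
            'W2': std * np.random.randn(hidden_size, output_size),
            'b2': np.zeros(output_size),
        }

    def predict(self, x):
        p = self.params
        z1 = sigmoid(x @ p['W1'] + p['b1'])
        return softmax(z1 @ p['W2'] + p['b2'])

    def loss(self, x, t):
        y = self.predict(x)
        return -np.sum(t * np.log(y + 1e-7)) / x.shape[0]
```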

Activation layer
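
A ReLU layer as a sketch of the forward/backward interface an activation layer needs: cache a mask on the forward pass, zero the same positions on the backward pass.

```python
import numpy as np

class Relu:
    def __init__(self):
        self.mask = None

    def forward(self, x):
        self.mask = (x <= 0)          # remember where inputs were non-positive
        out = x.copy()
        out[self.mask] = 0
        return out

    def backward(self, dout):
        dout = dout.copy()
        dout[self.mask] = 0           # gradient is 0 where the input was clipped
        return dout
```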

Backpropagation
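
Backpropagation reduces to local derivatives passed backward along a computational graph. A minimal multiply-node sketch (the numbers are illustrative):

```python
class MulLayer:
    def __init__(self):
        self.x, self.y = None, None

    def forward(self, x, y):
        self.x, self.y = x, y
        return x * y

    def backward(self, dout):
        # d(xy)/dx = y, d(xy)/dy = x: swap the cached inputs
        return dout * self.y, dout * self.x

# Price of 2 items at 100 each with 10% tax, then gradients via the chain rule
item_layer, tax_layer = MulLayer(), MulLayer()
price = tax_layer.forward(item_layer.forward(100, 2), 1.1)   # 220.0
ditem_price, dtax = tax_layer.backward(1)                    # 1.1, 200
ditem, dnum = item_layer.backward(ditem_price)               # 2.2, 110
```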

Affine layer
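
A sketch of the Affine (fully connected) layer with its backward pass: dW = xᵀ·dout, db sums dout over the batch, and the input gradient is dout·Wᵀ.

```python
import numpy as np

class Affine:
    def __init__(self, W, b):
        self.W, self.b = W, b
        self.x = None
        self.dW, self.db = None, None

    def forward(self, x):
        self.x = x
        return x @ self.W + self.b

    def backward(self, dout):
        self.dW = self.x.T @ dout          # gradient w.r.t. weights
        self.db = np.sum(dout, axis=0)     # gradient w.r.t. bias (summed over batch)
        return dout @ self.W.T             # gradient w.r.t. the input
```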

Softmax-with-Loss
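
A Softmax-with-Loss sketch; fusing softmax and cross-entropy makes the backward pass collapse to (y - t) / batch_size, which is the main reason the two are combined into one layer.

```python
import numpy as np

def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

class SoftmaxWithLoss:
    def __init__(self):
        self.y = None   # softmax output
        self.t = None   # one-hot targets

    def forward(self, x, t):
        self.t = t
        self.y = softmax(x)
        batch_size = x.shape[0]
        return -np.sum(t * np.log(self.y + 1e-7)) / batch_size

    def backward(self, dout=1):
        # Combined softmax + cross-entropy gradient simplifies to (y - t) / batch
        batch_size = self.t.shape[0]
        return dout * (self.y - self.t) / batch_size
```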

Optimizer
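
Two optimizer sketches sharing an update(params, grads) interface, assuming params and grads are dicts of NumPy arrays; SGD and Momentum are shown, and others (AdaGrad, Adam) follow the same shape.

```python
import numpy as np

class SGD:
    """Plain stochastic gradient descent: W <- W - lr * dW."""
    def __init__(self, lr=0.01):
        self.lr = lr

    def update(self, params, grads):
        for key in params:
            params[key] -= self.lr * grads[key]

class Momentum:
    """Momentum SGD: v <- momentum * v - lr * dW; W <- W + v."""
    def __init__(self, lr=0.01, momentum=0.9):
        self.lr, self.momentum, self.v = lr, momentum, None

    def update(self, params, grads):
        if self.v is None:
            self.v = {k: np.zeros_like(v) for k, v in params.items()}
        for key in params:
            self.v[key] = self.momentum * self.v[key] - self.lr * grads[key]
            params[key] += self.v[key]
```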

Batch Normalization
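
A training-mode Batch Normalization forward pass: normalize each feature over the mini-batch, then scale by gamma and shift by beta. Inference with running averages and the backward pass are omitted from this sketch.

```python
import numpy as np

def batchnorm_forward(x, gamma, beta, eps=1e-7):
    # x: (batch, features); gamma/beta start as ones/zeros and are learned
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)   # zero mean, unit variance per feature
    return gamma * x_hat + beta
```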

Weight decay (L2 regularization)
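
Weight decay adds an L2 penalty to the loss and a lam * W term to each weight gradient; the strength lam = 0.1 below is a hypothetical value.

```python
import numpy as np

def l2_penalty(weights, lam=0.1):
    # 0.5 * lam * ||W||^2 summed over every weight matrix (biases excluded)
    return 0.5 * lam * sum(np.sum(W ** 2) for W in weights)

# In the training loop:
#   loss = data_loss + l2_penalty([W1, W2], lam=0.1)
#   dW1 += lam * W1    # the penalty's contribution to the gradient
```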

Dropout
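
A Dropout layer sketch: randomly zero units during training, and scale activations by the keep probability at test time (the non-inverted variant).

```python
import numpy as np

class Dropout:
    def __init__(self, dropout_ratio=0.5):
        self.dropout_ratio = dropout_ratio
        self.mask = None

    def forward(self, x, train_flg=True):
        if train_flg:
            self.mask = np.random.rand(*x.shape) > self.dropout_ratio
            return x * self.mask              # drop units at random
        return x * (1.0 - self.dropout_ratio)  # scale by keep probability at test

    def backward(self, dout):
        return dout * self.mask               # gradients flow only through kept units
```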

Hyperparameter tuning (random search)
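
Random search samples hyperparameters on a log scale rather than over a grid; the search ranges and the train_and_evaluate helper below are hypothetical.

```python
import numpy as np

results = []
for _ in range(100):
    # Sample on a log scale: uniform over the exponent, not the value
    lr = 10 ** np.random.uniform(-6, -2)
    weight_decay = 10 ** np.random.uniform(-8, -4)
    # val_acc = train_and_evaluate(lr, weight_decay)  # hypothetical helper:
    # results.append((val_acc, lr, weight_decay))     # train briefly, record accuracy
# Keep the best few settings and narrow the ranges around them
```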

Convolutional neural network (CNN)
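
A naive single-channel convolution forward pass to make the sliding-window idea concrete; practical implementations usually unroll patches with im2col and use a single matrix multiply instead.

```python
import numpy as np

def conv2d(x, w, stride=1):
    # x: (H, W) input, w: (kh, kw) filter; single channel, no padding
    H, W = x.shape
    kh, kw = w.shape
    out_h = (H - kh) // stride + 1
    out_w = (W - kw) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            patch = x[i * stride:i * stride + kh, j * stride:j * stride + kw]
            out[i, j] = np.sum(patch * w)   # dot product of filter and window
    return out
```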