Considering Deep Learning

[Figure: three inputs x^1, x^2, x^3 pass through Feature Normalization. Each is multiplied by W1 to give z^1, z^2, z^3, passed through a Sigmoid to give a^1, a^2, a^3, then multiplied by W2, and so on through the later layers.]

Feature normalization is applied to the inputs x^1, x^2, x^3.
The intermediate values z^i and a^i are inputs to the next layer (W2), so they also need normalization.
Different dims have different ranges, so the later layers are also difficult to optimize.
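
The idea on this slide can be sketched in a few lines of NumPy. This is only a minimal illustration, not the lecture's own code: the batch size, the layer widths, the `normalize` helper, the `eps` constant, and the choice to compute the statistics over a batch and to normalize z before the sigmoid are all assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def normalize(v, eps=1e-5):
    # Standardize each dimension (column) using statistics over the batch (rows).
    # `eps` is a hypothetical small constant that avoids division by zero.
    mean = v.mean(axis=0, keepdims=True)
    std = v.std(axis=0, keepdims=True)
    return (v - mean) / (std + eps)

rng = np.random.default_rng(0)

# Hypothetical batch: 64 examples, 3 input dims with very different ranges.
x = rng.normal(size=(64, 3)) * np.array([1.0, 100.0, 0.01])
W1 = rng.normal(size=(3, 4))   # first layer weights (assumed shape)
W2 = rng.normal(size=(4, 2))   # second layer weights (assumed shape)

x_tilde = normalize(x)     # feature normalization of the inputs
z = x_tilde @ W1           # z = W1 x: dims can again have different ranges
z_tilde = normalize(z)     # so the intermediate values are normalized too
a = sigmoid(z_tilde)       # activation
out = a @ W2               # input to the next layer
```

After `normalize`, every column of `x_tilde` and `z_tilde` has roughly zero mean and unit standard deviation, so no single dimension dominates the input to the layer that follows.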