Optimization fails because ...

The gradient is close to zero: a critical point.
    local minima: no way to go
    saddle point: escape
Which one?

[Figure: training loss vs. updates. The loss stops decreasing while it is still not small enough.]
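The slide leaves "Which one?" open. One standard way to tell the two cases apart is the second-order test: look at the eigenvalues of the Hessian at the critical point. The sketch below is an assumption about where the slide is heading, not something it states. The helper names (`hessian_2d`, `classify_critical_point`) and the two toy losses (`bowl`, `saddle`) are hypothetical, and the finite-difference Hessian is only meant for small two-parameter illustrations.

```python
import math


def hessian_2d(f, x, y, h=1e-4):
    """Central finite-difference Hessian entries (fxx, fxy, fyy) of f at (x, y)."""
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h**2
    fxy = (
        f(x + h, y + h) - f(x + h, y - h) - f(x - h, y + h) + f(x - h, y - h)
    ) / (4 * h**2)
    return fxx, fxy, fyy


def classify_critical_point(f, x, y, tol=1e-6):
    """Classify a critical point of f by the signs of its Hessian eigenvalues."""
    fxx, fxy, fyy = hessian_2d(f, x, y)
    # Closed-form eigenvalues of the symmetric 2x2 matrix [[fxx, fxy], [fxy, fyy]].
    mean = (fxx + fyy) / 2
    radius = math.hypot((fxx - fyy) / 2, fxy)
    lo, hi = mean - radius, mean + radius
    if lo > tol:
        return "local minimum"   # all eigenvalues positive: no way to go
    if hi < -tol:
        return "local maximum"   # all eigenvalues negative
    if lo < -tol and hi > tol:
        return "saddle point"    # mixed signs: a descent direction exists
    return "inconclusive"        # some eigenvalue is (numerically) zero


# Hypothetical toy losses, both with zero gradient at the origin.
def bowl(x, y):
    return x**2 + y**2


def saddle(x, y):
    return x**2 - y**2


print(classify_critical_point(bowl, 0.0, 0.0))
print(classify_critical_point(saddle, 0.0, 0.0))
```

At a saddle point, the eigenvector belonging to a negative eigenvalue gives a direction along which the loss still decreases, which is why escape is possible there but not at a local minimum.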