Tuesday, April 25, 2023

ReLU functions

Despite their practical success, ReLU activations are theoretically ill-behaved: the loss surfaces they induce contain numerous trivial local optima where the gradient is exactly zero, yet we optimize these networks with approximate local, gradient-based optimizers.
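A minimal sketch of the zero-gradient issue (plain Python; the function names here are illustrative, not from any particular library): for any input at or below zero, ReLU's output and its derivative are both zero, so a unit stuck in that region receives no gradient signal at all.

```python
def relu(x):
    # ReLU outputs max(0, x): zero on the entire negative half-line.
    return max(0.0, x)

def relu_grad(x):
    # Derivative is 1 for x > 0 and 0 for x <= 0 (a subgradient at 0).
    # In the flat region a "dead" unit gets no gradient signal.
    return 1.0 if x > 0 else 0.0

# Units with non-positive pre-activation contribute nothing to the update:
for x in [-2.0, -0.5, 0.0, 1.5]:
    print(x, relu(x), relu_grad(x))
```

If every point in a weight-space neighborhood keeps the pre-activation negative, the gradient is zero throughout that neighborhood, which is exactly the kind of trivial stationary region described above.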


Be human

The biggest mistake repeated throughout history has always been people in power thinking they have the right to harm the innocent for what t...