Friday, March 3, 2023

python - Initialization of weights

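The main difference between a Gaussian random variable (numpy.random.randn()) and a uniform random variable is the distribution of the generated numbers. A uniform generator spreads its values evenly over a fixed range, while randn() draws from a standard normal distribution (mean 0, standard deviation 1), so values cluster around 0 and become rarer the further out you go.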

When used for weight initialization, randn() keeps most of the weights away from the extremes and concentrates them near the center of the distribution.
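To make that concrete, here is a rough sketch that counts how many samples land far from the center under each distribution. The uniform range of [-3, 3) and the cutoff of 2.0 are assumptions picked only for illustration; they are not part of any standard initialization scheme.

```python
import numpy as np

np.random.seed(0)  # fixed seed so the run is repeatable
n = 100_000

gaussian = np.random.randn(n)              # standard normal: mean 0, std 1
uniform = np.random.uniform(-3.0, 3.0, n)  # flat over [-3, 3) (assumed range)

cutoff = 2.0  # hypothetical threshold for "near the extremes"

# Fraction of samples whose magnitude exceeds the cutoff
frac_gaussian = np.mean(np.abs(gaussian) > cutoff)
frac_uniform = np.mean(np.abs(uniform) > cutoff)

print(f"Gaussian samples with |w| > {cutoff}: {frac_gaussian:.3f}")
print(f"Uniform samples with |w| > {cutoff}: {frac_uniform:.3f}")
```

In theory about 4.6% of standard normal samples fall more than two standard deviations from the mean, while a third of the uniform samples on this range do, so the Gaussian fraction should come out far smaller.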

An intuitive way to see this is to look at the sigmoid() activation function.

You’ll remember that its slope is extremely small where the output is close to 0 or close to 1, so weights that push a unit toward those extremes receive tiny gradients and converge much more slowly to the solution. Having most of the weights near the center speeds up convergence.
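A small sketch of the slope argument, assuming the usual logistic definition sigmoid(z) = 1 / (1 + exp(-z)). The helper names sigmoid and sigmoid_slope are made up for this example, and the sample inputs are arbitrary.

```python
import numpy as np

def sigmoid(z):
    """Logistic function, maps any real input to (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_slope(z):
    """Derivative of the sigmoid: s * (1 - s), where s = sigmoid(z)."""
    s = sigmoid(z)
    return s * (1.0 - s)

# Slope is largest at z = 0 (exactly 0.25) and shrinks as |z| grows
for z in (-5.0, -2.0, 0.0, 2.0, 5.0):
    print(f"z = {z:+.1f}  sigmoid = {sigmoid(z):.4f}  slope = {sigmoid_slope(z):.4f}")
```

The slope peaks at 0.25 when the input is 0 and falls below 0.01 once the input reaches 5 in either direction, which is why a unit pushed into that region learns so slowly.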
