Getting intuition for sigmoid.

$\psi = \nabla_\theta \log \pi$

$\log\sigma(\theta) = \log\frac{1}{1+e^{-\theta}} = -\log(1+e^{-\theta})$

$\frac{d}{d\theta}\log\sigma(\theta) = \frac{e^{-\theta}}{1+e^{-\theta}} = 1 - \sigma(\theta)$
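Quick sanity check of the derivative, as a rough sketch: compare the closed form $1 - \sigma(\theta)$ against a central finite difference of $\log\sigma$. The helper names (`sigmoid`, `log_sigmoid`, `score`, `numeric_grad`) are made up for this note, not from any library.

```python
import math


def sigmoid(theta: float) -> float:
    """Logistic function sigma(theta) = 1 / (1 + exp(-theta))."""
    return 1.0 / (1.0 + math.exp(-theta))


def log_sigmoid(theta: float) -> float:
    """log sigma(theta) = -log(1 + exp(-theta))."""
    return -math.log1p(math.exp(-theta))


def score(theta: float) -> float:
    """Closed-form derivative of log sigma: 1 - sigma(theta)."""
    return 1.0 - sigmoid(theta)


def numeric_grad(f, x: float, h: float = 1e-6) -> float:
    """Central finite-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


if __name__ == "__main__":
    # Print closed form next to the numerical estimate at a few points.
    for theta in (-3.0, 0.0, 3.0):
        print(theta, score(theta), numeric_grad(log_sigmoid, theta))
```

The two columns should agree to several decimal places. The score is close to 1 for very negative $\theta$ and close to 0 for very positive $\theta$.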

Fisher divergence

Non-matrix version: think 1D. Fix the current policy $\pi_\theta$ and define $f(\delta) = \text{KL}(\pi_\theta \,\|\, \pi_{\theta+\delta})$.

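Since $f(0) = 0$ and $f'(0) = 0$, to second order $f(\delta) \approx \tfrac{1}{2} F(\theta)\,\delta^2$, where $F$ is the Fisher information. Hence, for the same change in KL, we need a SMALLER step at $\theta=0$, where $F$ is large, than farther out, where $F$ is small.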
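To make this concrete, a small sketch under one assumption not stated above: the policy is Bernoulli with $\pi_\theta(a=1) = \sigma(\theta)$. Then $F(\theta) = \sigma(\theta)(1-\sigma(\theta))$, which peaks at $\theta = 0$. The function names and the KL budget `eps` are illustrative only.

```python
import math


def sigmoid(theta: float) -> float:
    return 1.0 / (1.0 + math.exp(-theta))


def kl_bernoulli(p: float, q: float) -> float:
    """KL(Bernoulli(p) || Bernoulli(q)) for p, q strictly inside (0, 1)."""
    return p * math.log(p / q) + (1.0 - p) * math.log((1.0 - p) / (1.0 - q))


def f(theta: float, delta: float) -> float:
    """f(delta) = KL(pi_theta || pi_{theta + delta}), assuming a Bernoulli policy."""
    return kl_bernoulli(sigmoid(theta), sigmoid(theta + delta))


def fisher(theta: float) -> float:
    """Fisher information of the assumed Bernoulli(sigmoid(theta)) policy."""
    p = sigmoid(theta)
    return p * (1.0 - p)


def step_for_kl_budget(theta: float, eps: float) -> float:
    """Step size delta solving 0.5 * F(theta) * delta**2 = eps."""
    return math.sqrt(2.0 * eps / fisher(theta))


if __name__ == "__main__":
    eps = 0.01  # hypothetical KL budget
    for theta in (0.0, 2.0, 4.0):
        delta = step_for_kl_budget(theta, eps)
        # Columns: theta, F(theta), allowed step, exact KL at that step.
        print(theta, fisher(theta), delta, f(theta, delta))
```

The allowed step grows as $\theta$ moves away from 0. The exact KL at that step stays near `eps` only as long as the quadratic approximation holds.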