average → incremental average → moving average → low-pass filter / EWMA → Kalman filter (like an LPF with a dynamically changing $\alpha$)
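
Every step in the chain above shares one update, $m \leftarrow m + \alpha\,(x - m)$, and only the choice of $\alpha$ changes. A minimal sketch, assuming a scalar random-walk state with known noise variances (`q`, `r`, `m0`, `p0` are illustrative names, not from any library):

```python
def incremental_mean(xs):
    """Running mean: m_n = m_{n-1} + (x_n - m_{n-1}) / n, i.e. alpha_n = 1/n."""
    m = 0.0
    for n, x in enumerate(xs, start=1):
        m += (x - m) / n
    return m


def ewma(xs, alpha):
    """Same update with a fixed alpha: a first-order low-pass filter."""
    m = xs[0]
    for x in xs[1:]:
        m += alpha * (x - m)
    return m


def kalman_1d(xs, q, r, m0=0.0, p0=1e6):
    """Scalar Kalman filter for a random-walk state observed in noise.

    q: process-noise variance, r: measurement-noise variance (assumed known).
    Returns the final estimate and the gains K_t, which play the role of alpha.
    """
    m, p = m0, p0
    gains = []
    for x in xs:
        p = p + q          # predict: uncertainty grows by q
        k = p / (p + r)    # Kalman gain = the time-varying alpha
        m += k * (x - m)   # same update form as the EWMA
        p = (1.0 - k) * p  # posterior variance after the measurement
        gains.append(k)
    return m, gains
```

With `q = 0` and a very vague prior the gains come out close to $1/n$, so the filter behaves like the incremental average; with `q > 0` the gain settles to a constant, which is an EWMA with that $\alpha$.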
convolutions ↔ Fourier transforms (convolution in one domain is pointwise multiplication in the other)
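
A quick numerical check of the convolution/Fourier link, assuming NumPy is available; `fft_convolve` is a helper name made up for this sketch:

```python
import numpy as np


def fft_convolve(a, b):
    """Linear convolution via the convolution theorem."""
    n = len(a) + len(b) - 1          # length of the full linear convolution
    fa = np.fft.rfft(a, n)           # zero-pads to length n before transforming
    fb = np.fft.rfft(b, n)
    return np.fft.irfft(fa * fb, n)  # pointwise product <-> convolution


rng = np.random.default_rng(0)
a = rng.normal(size=50)
b = rng.normal(size=8)

direct = np.convolve(a, b)    # sliding-window sum, O(len(a) * len(b))
via_fft = fft_convolve(a, b)  # O(n log n)
print(np.allclose(direct, via_fft))
```

The zero-padding to `len(a) + len(b) - 1` matters: without it the FFT product gives a circular convolution instead of the linear one.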
logistic regression generalizes to softmax regression
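
One way to see this: with two classes, softmax depends only on the difference of the logits and collapses to the logistic sigmoid. A small sketch (function names are illustrative):

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


# K = 2: softmax([z1, z0])[0] = 1 / (1 + exp(-(z1 - z0))) = sigmoid(z1 - z0)
z1, z0 = 1.7, -0.4
p_softmax = softmax([z1, z0])[0]
p_logistic = sigmoid(z1 - z0)
print(abs(p_softmax - p_logistic) < 1e-12)
```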
RNNs ~ SSMs (state space models): a linear RNN is a discrete-time SSM
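
A sketch of the simplest case of this correspondence, assuming a linear recurrence with no nonlinearity and a scalar input sequence: stepping $h_t = A h_{t-1} + B x_t,\ y_t = C h_t$ like an RNN gives the same output as a causal convolution with kernel $K_k = C A^k B$. Names and shapes are illustrative:

```python
import numpy as np


def ssm_recurrent(A, B, C, x):
    """Step h_t = A h_{t-1} + B x_t, y_t = C h_t one input at a time (RNN view)."""
    h = np.zeros(A.shape[0])
    ys = []
    for x_t in x:
        h = A @ h + B * x_t
        ys.append(C @ h)
    return np.array(ys)


def ssm_convolutional(A, B, C, x):
    """Same input-output map as a causal convolution with K_k = C A^k B."""
    kernel = []
    Ak_B = B.copy()
    for _ in range(len(x)):
        kernel.append(C @ Ak_B)
        Ak_B = A @ Ak_B
    return np.convolve(x, kernel)[: len(x)]  # keep only the causal part


rng = np.random.default_rng(1)
A = 0.3 * rng.normal(size=(3, 3))
B = rng.normal(size=3)
C = rng.normal(size=3)
x = rng.normal(size=20)
print(np.allclose(ssm_recurrent(A, B, C, x), ssm_convolutional(A, B, C, x)))
```

A nonlinearity between steps breaks the convolution form, which is where a general RNN and a linear SSM part ways.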
k-means ↔ EM algorithm (k-means is hard-assignment EM for a GMM with equal spherical covariances)
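
A sketch of that link, assuming a Gaussian mixture with equal weights and a shared covariance $\sigma^2 I$: as $\sigma \to 0$ the E-step responsibilities become one-hot and the M-step for the means becomes the k-means centroid update. Function names are illustrative:

```python
import numpy as np


def soft_assign(X, centers, sigma):
    """E-step of a GMM with equal weights and shared covariance sigma^2 * I."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)  # (n, k)
    logits = -d2 / (2.0 * sigma**2)
    logits -= logits.max(axis=1, keepdims=True)  # stabilize the exponentials
    r = np.exp(logits)
    return r / r.sum(axis=1, keepdims=True)      # responsibilities


def hard_assign(X, centers):
    """k-means assignment step: index of the nearest center."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)


def update_centers(X, r):
    """M-step for the means; with one-hot r this is the k-means centroid update."""
    return (r.T @ X) / r.sum(axis=0)[:, None]


X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 4.9]])
centers = np.array([[0.0, 0.1], [5.0, 5.0]])

r = soft_assign(X, centers, sigma=0.01)  # tiny sigma: effectively hard
print(r.argmax(axis=1), hard_assign(X, centers))
print(update_centers(X, r))
```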
L2 regularization is just MAP estimation with a zero-mean Gaussian prior on the weights!
https://twitter.com/ylecun/status/1303954359403847681?lang=en
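
A numerical check of the L2 / Gaussian-prior note, assuming linear regression with Gaussian noise of variance $\sigma^2$ and a prior $w \sim \mathcal{N}(0, \tau^2 I)$. Under those assumptions the ridge penalty is $\lambda = \sigma^2 / \tau^2$. The data and variable names are made up for the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.3 * rng.normal(size=40)

sigma2 = 0.3**2  # assumed noise variance of the Gaussian likelihood
tau2 = 2.0       # assumed variance of the zero-mean Gaussian prior on w
lam = sigma2 / tau2

# Ridge regression: argmin_w ||y - Xw||^2 + lam * ||w||^2
w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)

# Bayesian linear regression: posterior mode (= posterior mean) under that prior
precision = X.T @ X / sigma2 + np.eye(3) / tau2
w_map = np.linalg.solve(precision, X.T @ y / sigma2)

print(np.allclose(w_ridge, w_map))
```

A tighter prior (smaller `tau2`) means a larger `lam`, i.e. stronger shrinkage toward zero.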