Notes taken while working through the book “A Rigorous Look at Probability Theory” and other classic statistics books, e.g. “Probability and Statistics” by Morris DeGroot et al.
While reading “Introduction to Probability”, used at Harvard University, I studied a few probability inequalities that may be useful when working through the CS229 notes of the Stanford Machine Learning course.
Theorem 10.1.1 (Cauchy-Schwarz). For any r.v.s $X$ and $Y$ with finite variances:
$$ \vert E(XY) \vert \leq \sqrt{E(X^2)E(Y^2)} $$
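
As a quick sanity check, the inequality can be evaluated numerically. The sketch below is only an illustration, not part of the book: the helper name `cauchy_schwarz_sides`, the Gaussian samples, and the coefficient `0.5` are all assumptions chosen for the example. It replaces each expectation with a sample average, i.e. it evaluates both sides under the empirical distribution of the sample, for which the inequality also holds.

```python
import math
import random


def cauchy_schwarz_sides(xs, ys):
    """Return (|E(XY)|, sqrt(E(X^2) E(Y^2))) with E taken as the sample average."""
    n = len(xs)
    e_xy = sum(x * y for x, y in zip(xs, ys)) / n
    e_x2 = sum(x * x for x in xs) / n
    e_y2 = sum(y * y for y in ys) / n
    return abs(e_xy), math.sqrt(e_x2 * e_y2)


random.seed(0)  # fixed seed so the run is reproducible

# Hypothetical example data: X is standard normal, Y is correlated with X.
xs = [random.gauss(0.0, 1.0) for _ in range(10_000)]
ys = [0.5 * x + random.gauss(0.0, 1.0) for x in xs]

lhs, rhs = cauchy_schwarz_sides(xs, ys)
print(f"|E(XY)| = {lhs:.4f}, sqrt(E(X^2)E(Y^2)) = {rhs:.4f}")

# Equality case: Y is a constant multiple of X.
eq_lhs, eq_rhs = cauchy_schwarz_sides([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

In the second call $Y = 2X$, so the two sides agree up to floating-point rounding, which matches the fact that equality holds when one variable is a scalar multiple of the other.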