Review different samplers, e.g. Gibbs sampling, collapsed Gibbs sampling, and other MCMC methods, for parameter inference.

LDA

The Model - The Generative Story

$$ \begin{align*} & p(\beta_{1:K}, \theta_{1:D}, z_{1:D}, w_{1:D}) \\ =& \underbrace{\prod_{i=1}^{K} p(\beta_i)}_{\text{topic priors}} \prod_{d=1}^{D} p(\theta_d) \left( \prod_{n=1}^{N} \underbrace{p(z_{d,n} \vert \theta_d)}_{\text{topic assignment}} \, p(w_{d,n} \vert \underbrace{\beta_{1:K}, z_{d,n}}_{z_{d,n}\text{ picks a topic from }\beta_{1:K}}) \right) \end{align*} $$

“Notice that this distribution specifies a number of dependencies. For example, the topic assignment $z_{d,n}$ depends on the per-document topic proportions $\theta_d$. As another example, the observed word $w_{d,n}$ depends on the topic assignment $z_{d,n}$ and all of the topics $\beta_{1:K}$. (Operationally, that term is defined by looking up as to which topic $z_{d,n}$ refers to and looking up the probability of the word $w_{d,n}$ within that topic.)”
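To make the generative story concrete, here is a minimal sketch in Python that samples a toy corpus from this model. The symmetric Dirichlet hyperparameters `alpha` and `eta`, and the sizes `K`, `D`, `N`, `V`, are illustrative assumptions, not values from [PTM2012].

```python
# Minimal sketch of the LDA generative story, assuming symmetric Dirichlet
# priors; K, D, N, V, alpha, eta are illustrative choices, not from [PTM2012].
import numpy as np

rng = np.random.default_rng(0)
K, D, N, V = 3, 5, 20, 50          # topics, documents, words per document, vocabulary size
alpha, eta = 0.1, 0.01             # Dirichlet hyperparameters (assumed symmetric)

# Draw each topic beta_k: a distribution over the vocabulary.
beta = rng.dirichlet(np.full(V, eta), size=K)          # shape (K, V)

docs = []
for d in range(D):
    theta_d = rng.dirichlet(np.full(K, alpha))         # per-document topic proportions theta_d
    words = []
    for n in range(N):
        z_dn = rng.choice(K, p=theta_d)                # topic assignment z_{d,n} ~ theta_d
        w_dn = rng.choice(V, p=beta[z_dn])             # word w_{d,n} ~ beta_{z_{d,n}}
        words.append(w_dn)
    docs.append(words)
```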

Posterior Computation

$$ p(\beta_{1:K}, \theta_{1:D}, z_{1:D} \vert w_{1:D}) = \frac{p(\beta_{1:K}, \theta_{1:D}, z_{1:D}, w_{1:D})}{\underbrace{p(w_{1:D})}_{\text{evidence}}} $$

“Topic modelling algorithms form an approximation of [the above equation] by adapting an alternative distribution over the latent topic structure to be close to the true posterior”
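The evidence $p(w_{1:D})$ requires summing over every possible topic structure, which makes the exact posterior intractable; sampling-based methods such as collapsed Gibbs sampling are one family of approximations, alongside the variational methods discussed in [PTM2012]. Below is a minimal sketch of a collapsed Gibbs sampler, not the paper's algorithm, using the same assumed hyperparameters and `docs` representation (a list of lists of word ids) as the sketch above.

```python
# Minimal sketch of collapsed Gibbs sampling for LDA: beta and theta are
# integrated out, and each assignment z_{d,n} is resampled from its conditional
# given all other assignments. alpha, eta, and the `docs` format are assumptions.
import numpy as np

def collapsed_gibbs(docs, K, V, alpha=0.1, eta=0.01, iters=200, seed=0):
    rng = np.random.default_rng(seed)
    n_dk = np.zeros((len(docs), K))        # document-topic counts
    n_kv = np.zeros((K, V))                # topic-word counts
    n_k = np.zeros(K)                      # total words assigned to each topic
    z = []                                 # current topic assignments

    # Random initialization of topic assignments.
    for d, words in enumerate(docs):
        z_d = rng.integers(K, size=len(words))
        z.append(z_d)
        for w, k in zip(words, z_d):
            n_dk[d, k] += 1; n_kv[k, w] += 1; n_k[k] += 1

    for _ in range(iters):
        for d, words in enumerate(docs):
            for n, w in enumerate(words):
                k = z[d][n]
                # Remove the current assignment from the counts.
                n_dk[d, k] -= 1; n_kv[k, w] -= 1; n_k[k] -= 1
                # p(z_{d,n}=k | rest) ∝ (n_dk + alpha) * (n_kw + eta) / (n_k + V*eta)
                p = (n_dk[d] + alpha) * (n_kv[:, w] + eta) / (n_k + V * eta)
                k = rng.choice(K, p=p / p.sum())
                z[d][n] = k
                n_dk[d, k] += 1; n_kv[k, w] += 1; n_k[k] += 1

    # Posterior mean estimates of theta and beta from the final counts.
    theta = (n_dk + alpha) / (n_dk + alpha).sum(axis=1, keepdims=True)
    beta = (n_kv + eta) / (n_kv + eta).sum(axis=1, keepdims=True)
    return theta, beta, z

# Example usage with the toy corpus sampled in the previous sketch:
# theta_hat, beta_hat, z_hat = collapsed_gibbs(docs, K, V, alpha, eta)
```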

Reference

[PTM2012] David M. Blei. "Probabilistic Topic Models." Communications of the ACM, 55(4):77–84, 2012.

The quoted passages above are from [PTM2012].