Derivation of PRML (4.68)

Lei Yan

2019/04/25

From \((4.62)\) we know that \[ \begin{aligned} p(\mathcal{C}_k|x) &= \frac{p(x|\mathcal{C}_k)p(\mathcal{C}_k)}{\sum_j p(x|\mathcal{C}_j)p(\mathcal{C}_j)} \\ &= \frac{exp(a_k)}{\sum_j exp(a_j)} \end{aligned}\tag{1} \] Now, we should focus on the right hand of first equal sign.
Using the assumption that \(p(x|\mathcal{C}_k)\) is a normal distribution and all classes share the same covariance matrix, we have:

\[ \begin{aligned} p(\mathcal{C}_k|x) =& \frac{p(x|\mathcal{C}_k)p(\mathcal{C}_k)}{\sum_j p(x|\mathcal{C}_j)p(\mathcal{C}_j)} \\ =& \frac{(2\pi^{-D/2})|\Sigma|^{-1/2}exp\{\frac{1}{2}x^T\Sigma^{-1} x\}exp\{x^T\Sigma^{-1}\mu_k-\frac{1}{2}\mu_k^T\Sigma^{-1}\mu_k\}p(\mathcal{C}_k)}{\sum_j(2\pi^{-D/2})|\Sigma|^{-1/2}exp\{\frac{1}{2}x^T\Sigma^{-1} x\}exp\{x^T\Sigma^{-1}\mu_j-\frac{1}{2}\mu_k^T\Sigma^{-1}\mu_j\}p(\mathcal{C}_j)} \\ =& \frac{exp\{x^T\Sigma^{-1}\mu_k-\frac{1}{2}\mu_k^T\Sigma^{-1}\mu_k\}p(\mathcal{C}_k)}{\sum_jexp\{x^T\Sigma^{-1}\mu_j-\frac{1}{2}\mu_k^T\Sigma^{-1}\mu_j\}p(\mathcal{C}_j)} \end{aligned} \] Next, we can put \(p(\mathcal{C}_k)\) into \(exp\):

\[ \frac{exp\{x^T\Sigma^{-1}\mu_k-\frac{1}{2}\mu_k^T\Sigma^{-1}\mu_k + ln(p(\mathcal{C}_k))\}}{\sum_jexp\{x^T\Sigma^{-1}\mu_j-\frac{1}{2}\mu_k^T\Sigma^{-1}\mu_j+ ln(p(\mathcal{C}_j))\}} = \frac{exp(a_k)}{\sum_j exp(a_j)} \]

Therefore, \(a_k \neq ln(p(x|\mathcal{C}_k)p(\mathcal{C}_k))\), which according to \((4.63)\) the \(\neq\) should be \(=\). I think this might be an error.

So, we have \[ a_k(x) = x^T\Sigma^{-1}\mu_k-\frac{1}{2}\mu_k^T\Sigma^{-1}\mu_k + ln(p(\mathcal{C}_k) \]