I'd like to ask everyone a question about KL divergence. Wikipedia says: KL measures the expected number of extra bits required to code samples from P when using a code based on Q, rather than using a code based on P. Typically P represents the "true" distribution of data, observations, or a precisely calculated theoretical distribution. The measure Q typically represents a theory, model, description, or approximation of P.
The formula is: D_KL(P || Q) = sum_x p(x) * log( p(x) / q(x) )
Usually P is unknown and Q is an approximation of P. That is, the probabilities under Q can be computed, but if P is unknown, how can the relative entropy be computed? In other words, where does p(x) come from?
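For concreteness, here is a minimal sketch of one common answer: when P is unknown but we have samples drawn from it, p(x) is often replaced by the empirical distribution (relative frequencies) of the observed data. The function and variable names below are my own, and the sample data is made up for illustration; this is just one way to estimate the divergence, not the only one.

```python
import math
from collections import Counter

def kl_divergence(p, q):
    """Discrete KL divergence D_KL(P || Q) = sum_x p(x) * log2(p(x) / q(x)).

    p, q: dicts mapping outcomes to probabilities. q[x] must be > 0
    wherever p[x] > 0, otherwise the divergence is infinite.
    """
    return sum(px * math.log2(px / q[x]) for x, px in p.items() if px > 0)

def empirical_distribution(samples):
    """Estimate p(x) by the relative frequency of each outcome."""
    counts = Counter(samples)
    n = len(samples)
    return {x: c / n for x, c in counts.items()}

# Example: P is unknown, but we observed some samples drawn from it.
samples = ["a", "a", "a", "b"]           # 3/4 "a", 1/4 "b"
p_hat = empirical_distribution(samples)  # empirical estimate of P
q = {"a": 0.5, "b": 0.5}                 # our model / approximation Q
print(kl_divergence(p_hat, q))
```

Note that with a small sample the empirical estimate can be poor (e.g. an outcome that never appears gets probability 0), so in practice smoothing or density estimation is often applied before computing the divergence.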
Thanks in advance for any guidance.
This post is from the SPSS section of 人大经济论坛 (Renmin University Economic Forum); original source:
https://bbs.pinggu.org/forum.php?mod=viewthread&tid=1283500&page=1&from^^uid=1523309