Probability Mass Function
Conceptually, we are doing N independent draws from a categorical distribution with K categories. Let us represent the independent draws as random categorical variables $z_n$ for $n = 1, \dots, N$. Let us denote the number of times a particular category $k$ has been seen (for $k = 1, \dots, K$) among all the categorical variables as $n_k$. Note that $\sum_k n_k = N$. Then, we have two separate views onto this problem:
- A set of N categorical variables $z_1, \dots, z_N$.
- A single vector-valued variable $\mathbf{x} = (n_1, \dots, n_K)$, distributed according to a multinomial distribution.
The former case is a set of random variables specifying each individual outcome, while the latter is a variable specifying the number of outcomes of each of the K categories. The distinction is important, as the two cases have correspondingly different probability distributions.
In both cases, the parameter of the categorical distribution is $\mathbf{p} = (p_1, p_2, \dots, p_K)$, where $p_k$ is the probability of drawing value $k$; $\mathbf{p}$ is likewise the parameter of the multinomial distribution $P(\mathbf{x} \mid \mathbf{p})$. Rather than specifying $\mathbf{p}$ directly, we give it a conjugate prior distribution, and hence it is drawn from a Dirichlet distribution with parameter vector $\boldsymbol{\alpha} = (\alpha_1, \alpha_2, \dots, \alpha_K)$.
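The generative process described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `sample_dirichlet` and `sample_two_views`, the example Dirichlet parameters `[1.0, 2.0, 3.0]`, and the zero-based category labels are all choices made here, not part of the original text. The Dirichlet draw uses the standard construction of normalising independent Gamma variates.

```python
import random
from collections import Counter


def sample_dirichlet(alpha, rng):
    """Draw p ~ Dirichlet(alpha) by normalising independent Gamma(alpha_k, 1) variates."""
    gammas = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(gammas)
    return [g / total for g in gammas]


def sample_two_views(alpha, n_draws, rng):
    """Return (p, z, x): the drawn parameter, the individual outcomes, and the counts."""
    num_categories = len(alpha)
    p = sample_dirichlet(alpha, rng)
    # View 1: the N individual categorical outcomes (categories labelled 0..K-1 here).
    z = rng.choices(range(num_categories), weights=p, k=n_draws)
    # View 2: a single count vector with one entry per category.
    tally = Counter(z)
    x = [tally[k] for k in range(num_categories)]
    return p, z, x


rng = random.Random(0)  # fixed seed so the run is repeatable
p, z, x = sample_two_views([1.0, 2.0, 3.0], 10, rng)
print("outcomes:", z)
print("counts:  ", x)
```

The list `z` carries the order of the outcomes, while `x` keeps only how often each category occurred; many different sequences map to the same count vector, which is why the two views end up with different probability distributions.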
By integrating out p, we obtain a compound distribution. However, the form of the distribution is different depending on which view we take.
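To make the difference between the two views concrete, the sketch below evaluates the standard closed forms of the compound distribution, which are not spelled out in this section and are stated here as background. Writing n_k for the number of times category k occurs, alpha_k for the Dirichlet parameters, and A for the sum of the alpha_k, the probability of one particular sequence of outcomes is Gamma(A) / Gamma(N + A) times the product over k of Gamma(n_k + alpha_k) / Gamma(alpha_k). The probability of a count vector is that value multiplied by the multinomial coefficient N! / (n_1! ... n_K!), which counts the sequences sharing those counts. The function names and example parameters are illustrative assumptions.

```python
from math import exp, lgamma


def log_pmf_sequence(counts, alpha):
    """Log-probability of ONE specific outcome sequence whose category counts are `counts`."""
    alpha_sum = sum(alpha)
    n_total = sum(counts)
    result = lgamma(alpha_sum) - lgamma(n_total + alpha_sum)
    for n_k, a_k in zip(counts, alpha):
        result += lgamma(n_k + a_k) - lgamma(a_k)
    return result


def log_pmf_counts(counts, alpha):
    """Log-probability of the count vector: sequence probability times N! / prod(n_k!)."""
    n_total = sum(counts)
    log_coef = lgamma(n_total + 1) - sum(lgamma(n + 1) for n in counts)
    return log_coef + log_pmf_sequence(counts, alpha)


alpha = [1.0, 2.0, 3.0]  # example Dirichlet parameters (an assumption)
N = 4

# Sum the count-vector probabilities over every (n_1, n_2, n_3) with n_1 + n_2 + n_3 = N.
total = sum(
    exp(log_pmf_counts((a, b, N - a - b), alpha))
    for a in range(N + 1)
    for b in range(N + 1 - a)
)
print("sum over all count vectors:", total)

# With a single draw, the probability of category k reduces to alpha_k / sum(alpha).
single = exp(log_pmf_sequence([0, 0, 1], alpha))
print("P(one draw lands in the third category):", single)
```

Working in log space with `lgamma` avoids overflow in the Gamma functions for large counts. The count-vector probabilities sum to one over all vectors that add up to N, whereas the sequence probabilities sum to one over all K^N ordered sequences.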