I have asked about this before and have really been struggling to identify what makes something a model parameter versus a latent variable. Looking at various threads on this topic on this site, the main distinction seems to be:
Latent variables are not observed, but, being random variables, have a probability distribution associated with them. Parameters are also not observed but have no distribution associated with them; I understand them as constants with a fixed but unknown value that we are trying to find. We can still put priors on the parameters to represent our uncertainty about them, even though there is only one true value associated with them (or at least that is what we assume). I hope I am correct so far?
Now, I have been looking at this example of Bayesian weighted linear regression from a journal paper, and I have really been struggling to work out which quantities are parameters and which are variables:
Here x and y are observed, but only y is treated as a random variable, i.e. has a distribution associated with it.
Now, the modelling assumptions are:
So, the variance of y is weighted.
There is also a prior distribution on β and w, which are normal and gamma distributions respectively.
So, the full log likelihood is given by:
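To make the setup concrete, here is a minimal sketch (in Python) of the complete-data log likelihood for a model of this general kind. The specific parameterization below — per-point precision weights w_i scaling the noise variance, a zero-mean normal prior on β, and hyperparameters a, b, tau — is my own illustrative assumption and may differ from the exact forms in the paper:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Assumed generative model (illustrative, not necessarily the paper's):
#   y_i  ~ N(x_i^T beta, 1 / w_i)   -- each point's noise scaled by a weight
#   beta ~ N(0, tau^2 I)            -- normal prior on regression coefficients
#   w_i  ~ Gamma(a, b)              -- gamma prior on the per-point weights
n, d = 50, 3
X = rng.normal(size=(n, d))
beta_true = np.array([1.0, -2.0, 0.5])
w = rng.gamma(shape=2.0, scale=1.0, size=n)          # latent precisions
y = X @ beta_true + rng.normal(size=n) / np.sqrt(w)

def log_joint(beta, w, a=2.0, b=1.0, tau=10.0):
    """Complete-data log likelihood log p(y, beta, w | X)."""
    ll_y = stats.norm.logpdf(y, loc=X @ beta, scale=1.0 / np.sqrt(w)).sum()
    lp_beta = stats.norm.logpdf(beta, loc=0.0, scale=tau).sum()
    lp_w = stats.gamma.logpdf(w, a=a, scale=1.0 / b).sum()
    return ll_y + lp_beta + lp_w

print(log_joint(beta_true, w))
```

The point of writing it this way is that β and w both enter through log-density terms of their own (lp_beta, lp_w), which is exactly what makes them random variables rather than fixed constants.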
Now, as I understand it, both β and w are model parameters: my reasoning is that they both appear in the probability distribution of the variable y. However, in the paper the authors keep referring to them as latent random variables. Is that correct? If so, what would the model parameters be?
The paper can be found here (http://www.jting.net/pubs/2007/ting-ICRA2007.pdf).
The paper is Automatic Outlier Detection: A Bayesian Approach by Ting et al.
In the paper, and in general, (random) variables are everything that is drawn from a probability distribution. Latent (random) variables are the ones you don't directly observe (y is observed, β is not, but both are random variables). For a latent random variable you can obtain a posterior distribution, which is its probability distribution conditioned on the observed data.
On the other hand, a parameter is fixed, even if you don't know its value. Maximum likelihood estimation, for instance, gives you the most likely value of your parameter. But it gives you a point, not a full distribution, because fixed things do not have distributions! (You can put a distribution on how sure you are about this value, or on what range you think it lies in, but that is not the same as a distribution of the value itself, which only exists if the value is actually a random variable.)
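A toy example of that point-vs-distribution contrast (my own, not from the paper): estimating the mean of a normal with known variance. MLE returns a single number, while the conjugate Bayesian update returns an entire posterior distribution over the same quantity:

```python
import numpy as np

rng = np.random.default_rng(1)
# Toy data: draws from N(mu, 1) with mu unknown and the variance known.
data = rng.normal(loc=3.0, scale=1.0, size=20)

# Maximum likelihood: a single point estimate of the parameter mu.
mu_mle = data.mean()

# Bayesian treatment: put a N(0, s0^2) prior on mu and get a full
# posterior N(mu_n, s_n_sq) via the standard conjugate-normal update.
s0_sq, sigma_sq, n = 100.0, 1.0, len(data)
s_n_sq = 1.0 / (1.0 / s0_sq + n / sigma_sq)
mu_n = s_n_sq * (n * data.mean() / sigma_sq)   # prior mean is 0

print(f"MLE (a point):       {mu_mle:.3f}")
print(f"Posterior (a dist.): N({mu_n:.3f}, {s_n_sq:.4f})")
```

Note how the posterior variance s_n_sq shrinks as n grows: the distribution concentrates, but it is still a distribution, not a point.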
In a Bayesian setting, you can have all of them. Here, parameters are things like the number of clusters; you give this value to the model, and the model considers it a fixed number. y is a random variable because it is drawn from a distribution, and β and w are latent random variables because they are drawn from probability distributions as well. The fact that y depends on β and w doesn’t make them “parameters”, it just makes y dependent on two random variables.
In the paper they consider that β and w are random variables.
In this sentence:
These update equations need to be run iteratively until all parameters
and the complete log likelihood converge to steady values
they are talking about the actual parameters, not the quantities that are random variables, since optimizing over parameters is what you do in EM: the latent variables are handled by expectations in the E-step, and the parameters are optimized in the M-step.
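That E-step/M-step split can be sketched with a closely related (but simpler) model of my own choosing: robust regression as a Student-t scale mixture, where each point carries a latent gamma weight w_i, assumed w_i ~ Gamma(ν/2, ν/2). This is illustrative only and not the paper's exact model. The latent weights get posterior expectations in the E-step, while β and σ² are the parameters updated in the M-step:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy data with gross outliers; the latent weights should downweight them.
n, d = 200, 2
X = rng.normal(size=(n, d))
beta_true = np.array([2.0, -1.0])
y = X @ beta_true + rng.normal(size=n)
y[:10] += 15.0                                  # contaminate 10 points

nu, sigma2 = 4.0, 1.0
beta = np.zeros(d)
for _ in range(50):
    # E-step: posterior expectation of each latent weight w_i,
    # E[w_i | y_i] = (nu + 1) / (nu + r_i^2 / sigma^2).
    r = y - X @ beta
    Ew = (nu + 1.0) / (nu + r**2 / sigma2)
    # M-step: update the *parameters* beta and sigma2 (weighted least squares).
    beta = np.linalg.solve(X.T @ (Ew[:, None] * X), X.T @ (Ew * y))
    sigma2 = float(np.sum(Ew * (y - X @ beta) ** 2) / n)

print("beta estimate:", beta.round(2))
print("mean weight on outliers:", Ew[:10].mean().round(3))
```

The latent w_i never get a single "estimated value" here — only expectations under their posterior — whereas β and σ² converge to fixed numbers, which is exactly the parameter/latent-variable distinction the quoted sentence relies on.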