In Bayesian statistics, parameters are said to be random variables while data are said to be nonrandom. Yet if we look at the Bayesian updating formula

$$
p(\theta|y)=\frac{p(\theta)p(y|\theta)}{p(y)},
$$

we find a probability (density or mass) conditioned on the data, as well as the conditional and unconditional probability (density or mass) of the data itself. How does it make sense to consider a probability (density or mass) conditioned on a constant, or the probability (density or mass) of a constant?

**Answer**

The Bayesian approach to (parametric) statistical inference starts from a statistical model, i.e., a family of parametrised distributions,

$$X\sim F_\theta,\qquad\theta\in\Theta$$

and it introduces a supplementary probability distribution on the parameter

$$\theta\sim\pi(\theta)$$

The posterior distribution on $\theta$ is thus defined as the conditional distribution of $\theta$ given $X=x$, where $x$ is the observed data. This construction clearly relies on the **assumption that the data is a realisation of a random variable with a well-defined distribution**. It would otherwise be impossible to define a conditional distribution like the posterior, since there would be no random variable to condition upon.
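To make the conditioning concrete, here is a minimal Python sketch under assumptions that are not part of the answer itself: a Beta(1, 1) prior, a Binomial model with 10 trials, and an observed count of 7, all chosen purely for illustration. It draws $(\theta, X)$ pairs from the joint distribution, keeps the pairs whose $X$ equals the observed value, and compares the average of the retained $\theta$ with the closed-form Beta posterior mean. The function names are hypothetical.

```python
import random


def simulate_joint(n_draws, n_trials, a, b, rng):
    """Draw (theta, x) pairs from the joint distribution:
    theta ~ Beta(a, b), then x | theta ~ Binomial(n_trials, theta)."""
    pairs = []
    for _ in range(n_draws):
        theta = rng.betavariate(a, b)
        # A Binomial draw as a sum of n_trials Bernoulli(theta) draws.
        x = sum(rng.random() < theta for _ in range(n_trials))
        pairs.append((theta, x))
    return pairs


def conditional_mean_of_theta(pairs, x_obs):
    """Average theta over the simulated pairs whose data equal x_obs."""
    kept = [theta for theta, x in pairs if x == x_obs]
    return sum(kept) / len(kept), len(kept)


def exact_posterior_mean(x_obs, n_trials, a, b):
    """Mean of the conjugate posterior Beta(a + x_obs, b + n_trials - x_obs)."""
    return (a + x_obs) / (a + b + n_trials)


# Illustrative values only (assumptions, not taken from the answer).
rng = random.Random(0)
pairs = simulate_joint(100_000, 10, 1.0, 1.0, rng)
mc_mean, n_kept = conditional_mean_of_theta(pairs, 7)
exact = exact_posterior_mean(7, 10, 1.0, 1.0)
print(f"kept {n_kept} draws; simulated mean {mc_mean:.4f}; exact mean {exact:.4f}")
```

The filtering step is only meaningful because $X$ is treated as random in the simulation: the observed value $x$ picks out one slice of the joint distribution, and the posterior is the distribution of $\theta$ within that slice.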

The confusion may stem from a genuine difference between the Bayesian and frequentist approaches: frequentist procedures are evaluated and compared through their frequency properties, i.e., by averaging over all possible realisations of the data, whereas the Bayesian approach conditions on the actual realisation. For instance, the frequentist risk of a procedure $\delta$ for a loss function $L(\theta,d)$ is

$$R(\theta,\delta) = \mathbb E_\theta[L(\theta,\delta(X))]$$

while the Bayesian posterior loss of a procedure $\delta$ for the prior $\pi$ is

$$\rho(\delta(x),\pi) = \mathbb E^\pi[L(\theta,\delta(x))|X=x]$$
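As an illustrative sketch of this contrast, the Python snippet below computes both quantities exactly in a toy setting. The squared-error loss, the Binomial model, the Beta(1, 1) prior, the choice $\delta(x)=x/n$ and the numerical values are all assumptions made here for the example, not choices from the answer. The frequentist risk sums over every possible data value with $\theta$ held fixed, while the posterior loss integrates over $\theta$ with the observed $x$ held fixed.

```python
from math import comb


def binom_pmf(x, n, theta):
    """Binomial(n, theta) probability mass at x."""
    return comb(n, x) * theta**x * (1 - theta) ** (n - x)


def frequentist_risk(delta, theta, n):
    """R(theta, delta): squared-error loss averaged over all possible
    data x = 0..n, with theta held fixed."""
    return sum((delta(x) - theta) ** 2 * binom_pmf(x, n, theta)
               for x in range(n + 1))


def posterior_loss(d, x, n, a, b):
    """rho(d, pi): squared-error loss of the decision d averaged over
    theta ~ Beta(a + x, b + n - x), with the observed x held fixed.
    Uses E[(theta - d)^2 | x] = Var(theta | x) + (E[theta | x] - d)^2."""
    a_post, b_post = a + x, b + n - x
    total = a_post + b_post
    mean = a_post / total
    var = a_post * b_post / (total**2 * (total + 1))
    return var + (mean - d) ** 2


# Illustrative values only (assumptions, not taken from the answer).
n = 10


def mle(x):
    """Hypothetical procedure delta(x) = x / n."""
    return x / n


risk_at_half = frequentist_risk(mle, 0.5, n)         # a function of theta
loss_at_7 = posterior_loss(mle(7), 7, n, 1.0, 1.0)   # a function of x
loss_at_post_mean = posterior_loss(8 / 12, 7, n, 1.0, 1.0)
print(risk_at_half, loss_at_7, loss_at_post_mean)
```

Under squared-error loss the posterior loss is smallest when the decision equals the posterior mean, which is why `loss_at_post_mean` comes out below `loss_at_7` in this setting.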

**Attribution**

*Source: Link, Question Author: Richard Hardy, Answer Author: Xi’an*