For example, in generative adversarial network, we often hear that inference is easy because the conditional distribution of x given latent variable z is ‘tractable’. Also, I read somewhere that Boltzmann machine and variational autoencoder is used where the posterior distribution is not tractable so some sort of approximation need to be applied. Could anyone tell me what ‘tractable’ means, in a rigorous definition? Or could anyone explain in any of the examples I gave above, what tractable exactly means in that context?
To my best memory, I’ve never come across a formal definition for this in a statistical text, but I think you can stitch one together from a few contextual readings. Start with Bayesian Data Analysis, p. 261:
Bayesian computation revolves around two steps: computation of the posterior distribution, p(θ|y), and computation of the posterior predictive distribution, p(ˆy|y). So far we have considered examples where these could computed analytically in closed form[.]
The obstacle is generally the marginal likelihood, the denominator on the right-hand side of Bayes’ rule, which could involve an integral that cannot be analytically expressed. For a more I think you’ll find wiki’s article on closed-form expression helpful for context (emphasis mine):
In mathematics, a closed-form expression is a mathematical expression that can be evaluated in a finite number of operations. It may contain constants, variables, certain “well-known” operations (e.g., + − × ÷), and functions (e.g., nth root, exponent, logarithm, trigonometric functions, and inverse hyperbolic functions), but usually no limit. The set of operations and functions admitted in a closed-form expression may vary with author and context.
Problems are said to be tractable if they can be solved in terms of a closed-form expression.
If you read on, you’ll see a table of classes of expressions, and “Analytic expressions” includes several involved in the normalizing constants of exponential family distributions. E.g., the gamma function in the gamma distribution, and the Bessel function in the von-Mises Fisher.
Meaning, we’re willing to admit at least these into our definition of “tractability.” (There may be other distributions that involve the classes of operations classified as “analytic expressions”; I confess I’m not familiar.)