A random variable that induces a $\sigma$-algebra the same as the one in the sample space

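Consider a probability space $(\Omega,\mathcal{F},P)$ where $\Omega$ is the sample space, $\mathcal{F}$ is the $\sigma$-algebra of $\Omega$, and $P$ is the probability measure. Let $X:\Omega\to\mathbb{R}$ be a random variable, inducing a $\sigma$-algebra $\mathcal{F}_X=\{X^{-1}(B)\mid B\in\mathcal{B}\}$ where $\mathcal{B}$ is the Borel algebra on $\mathbb{R}$.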

Suppose we have another random variable $Y$ with induced $\sigma$-algebra $\mathcal{F}_Y$. The conditional expectation can then be defined as $E[Y\mid X]\overset{a.s.}{=}g$, where $g$ is determined by $E[Y I_A]=\int_A g(s)P(s)\,ds,\ \forall A\in\mathcal{F}_X$, according to the Radon–Nikodym theorem.

I wonder what will happen if $\mathcal{F}_X=\mathcal{F}$ already? It looks like $E[Y\mid X]=Y$, so $X$ is kind of independent of any arbitrary random variable, but I am not sure. Or, can anybody give an example of this kind of $X$? Thank you.

Some references: Theory of Statistics by Mark J. Schervish.

[Figure: binary tree of the two-step price model; image not available]


When you chase the definitions, the issues become trivial (although perhaps still unintuitive):

  • $E(Y\mid X)=E(Y\mid\mathcal{F}_X)$ by definition.

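  • For any subalgebra $\mathcal{G}\subset\mathcal{F}$, $E(Y\mid\mathcal{G})$ is defined to be any $\mathcal{G}$-measurable function for which

    $$\int_G E(Y\mid\mathcal{G})(\omega)\,dP(\omega)=\int_G Y(\omega)\,dP(\omega)$$

    for every $G\in\mathcal{G}$.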

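Therefore, whenever $\mathcal{F}_X=\mathcal{F}$, it must be the case that

  1. $E(Y\mid X)$ is $\mathcal{F}$-measurable and

  2. $\int_F E(Y\mid X)(\omega)\,dP(\omega)=\int_F Y(\omega)\,dP(\omega)$ for every $F\in\mathcal{F}$.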

The equality of the integrals for all measurable sets implies (as is well known and established early in any account of Lebesgue integration) that $E(Y\mid X)$ must equal $Y$ almost surely (a.s.): they can differ only on a set of measure zero.
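For completeness, here is a sketch of the standard argument behind that "well known" step. The symbols $Z$ and $A$ are introduced only for this sketch, and $Y$ is assumed to be integrable.

```latex
Let $Z = E(Y\mid X) - Y$, which is $\mathcal{F}$-measurable, and let
$A = \{\omega : Z(\omega) > 0\} \in \mathcal{F}$. Taking $F = A$ in the
equality of integrals gives
\[
  \int_A Z(\omega)\,dP(\omega)
  = \int_A E(Y\mid X)(\omega)\,dP(\omega) - \int_A Y(\omega)\,dP(\omega)
  = 0 .
\]
Since $Z > 0$ everywhere on $A$, a zero integral forces $P(A) = 0$.
The same argument applied to $\{\omega : Z(\omega) < 0\}$ shows
$P(Z \neq 0) = 0$, that is, $E(Y\mid X) = Y$ almost surely.
```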

The second part of the question requests an example. Let’s construct a very simple but not entirely trivial one. It concerns a finite binomial process used to model (among other things) changes in prices of a financial asset over time. For simplicity, I have restricted it to a sequence of two times during which the price could go up ($+$) or down ($-$), whence

  • $\Omega$ can be identified with the set $\{++,+-,-+,--\}$.

  • $\mathcal{F}$ consists of all subsets of $\Omega$ (the discrete algebra).

  • $P$ is determined by its values on the atoms, written $p_{++}=P(\{++\})$, etc.


Let Y be the price of the asset after the first time and X be its price after the second time. (These natural and meaningful descriptions show this is not some pathological construction we’re about to review.)

The figure displays this model as a binary tree in which the individual (conditional!) probabilities label the branches, the elements of Ω are the four possible paths from left to right through the tree, and the values of Y and X are indicated at the points where they are determined.

Suppose all four prices assigned by $X$ are distinct. Then, since any individual price is measurable in $\mathcal{B}(\mathbb{R})$, $\mathcal{F}_X$ contains all the atoms, whence it consists of $\mathcal{F}$ itself. But $Y$ can assign at most two distinct prices, $Y_+=Y(++)=Y(+-)$ and $Y_-=Y(-+)=Y(--)$. The inverse images of these two prices then are the sets $+_1=\{++,+-\}$ and $-_1=\{-+,--\}$. They generate a strict subalgebra of $\mathcal{F}$: it has four measurable sets and does not include any of the atoms. It describes what is “known” after the first time but before the second one.
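The counting in this paragraph can be checked by brute force. The following is a minimal sketch, not part of the original answer: the string labels for the paths, the helper `generated_sigma_algebra`, and the numeric prices are all assumptions chosen for illustration (any four distinct values for $X$ and two values for $Y$ would do).

```python
from itertools import combinations

# Sample space: the four two-step paths (hypothetical string labels).
OMEGA = ("++", "+-", "-+", "--")


def generated_sigma_algebra(rv):
    """Return the sigma-algebra generated by rv (a dict: outcome -> value).

    On a finite sample space it consists of all unions of level sets of rv.
    """
    # The level sets {rv = v} are the atoms of the generated sigma-algebra.
    levels = {}
    for omega in OMEGA:
        levels.setdefault(rv[omega], set()).add(omega)
    atoms = [frozenset(s) for s in levels.values()]

    # Every measurable set is a union of atoms (the empty union is the empty set).
    events = set()
    for k in range(len(atoms) + 1):
        for chosen in combinations(atoms, k):
            events.add(frozenset().union(*chosen))
    return events


# Illustrative prices (assumed): X takes four distinct values, Y only two.
X = {"++": 121, "+-": 99, "-+": 101, "--": 81}
Y = {"++": 110, "+-": 110, "-+": 90, "--": 90}

F_X = generated_sigma_algebra(X)
F_Y = generated_sigma_algebra(Y)

print(len(F_X))                               # 16: every subset of OMEGA
print(len(F_Y))                               # 4
print(F_Y < F_X)                              # True: a strict subalgebra
print(any(len(event) == 1 for event in F_Y))  # False: no single-path events
```

With these values `F_X` is the full power set of the four paths, while `F_Y` contains only the empty set, the two first-step events, and the whole space.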

The definition of conditional expectation needs to be checked only on a basis for $\mathcal{F}_X$. The set of its atoms is most convenient. Here is an example of a calculation for the atom $\{+-\}$:

$$E(Y\mid X)(+-)\,p_{+-}=\int_{\{+-\}}Y(\omega)\,dP(\omega)=Y(+-)\,p_{+-}.$$

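The parallel calculations for the other atoms make it clear that for all $\omega\in\Omega$,

$$E(Y\mid X)(\omega)\,p_\omega=\int_{\{\omega\}}Y(\omega')\,dP(\omega')=Y(\omega)\,p_\omega,$$

where the second equality computes the integral directly. From this we can construct two interesting examples: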

  1. Suppose every outcome has nonzero probability. Then we may always divide both sides by $p_\omega$, no matter what $\omega$ may be, and obtain

    $$E(Y\mid X)(\omega)=Y(\omega).$$

    The conditional expectation of $Y$ is just $Y$ itself.

  2. Suppose $p_{++}=p_{+-}=1/2$ and $p_{--}=p_{-+}=0$. (This models a situation where an initial decrease is impossible.) Then we may define $E(Y\mid X)$ to be any value at $--$ and at $-+$, since it does not matter (due to the impossibility of this event): the defining equality for $\omega=--$,

    $$E(Y\mid X)(--)\cdot 0=\int_{\{--\}}Y(\omega)\,dP(\omega)=Y(--)\cdot 0,$$

    and its counterpart for $\omega=-+$ automatically hold. Thus, it is not necessarily the case that $E(Y\mid\mathcal{F})=Y$, but the set of $\omega$ where the two sides differ must have zero probability (and, of course, be measurable with respect to $Y$).
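Both cases can be reproduced numerically. This is a minimal sketch under stated assumptions, not the answer's own method: the helper `conditional_expectation`, its `default` parameter, and the numeric prices are hypothetical names and values chosen for illustration. Exact fractions are used so the comparisons are not affected by rounding.

```python
from fractions import Fraction


def conditional_expectation(Y, X, p, default=None):
    """Return one version of E[Y | X] on a finite sample space.

    Y, X and p are dicts keyed by outcome. On each level set of X with
    positive probability the value is the p-weighted average of Y. On a
    level set of probability zero any value satisfies the defining
    equality, so the caller-supplied `default` is used there.
    """
    result = {}
    for omega in X:
        block = [w for w in X if X[w] == X[omega]]  # level set of X through omega
        mass = sum(p[w] for w in block)
        if mass > 0:
            result[omega] = sum(Y[w] * p[w] for w in block) / mass
        else:
            result[omega] = default
    return result


# Illustrative prices (assumed): X distinguishes all four paths, Y only the first step.
X = {"++": 121, "+-": 99, "-+": 101, "--": 81}
Y = {"++": 110, "+-": 110, "-+": 90, "--": 90}

# Case 1: every path has nonzero probability, so E[Y | X] equals Y everywhere.
p_positive = {w: Fraction(1, 4) for w in X}
case1 = conditional_expectation(Y, X, p_positive)
print(case1 == Y)  # True

# Case 2: an initial decrease is impossible; the version is arbitrary on those paths.
p_degenerate = {"++": Fraction(1, 2), "+-": Fraction(1, 2),
                "-+": Fraction(0), "--": Fraction(0)}
case2 = conditional_expectation(Y, X, p_degenerate, default=0)
differs = [w for w in X if case2[w] != Y[w]]
print(differs)                               # ['-+', '--']
print(sum(p_degenerate[w] for w in differs)) # 0
```

In the second case the computed version differs from `Y` on two paths, yet the defining equality still holds on every atom because both sides are multiplied by a zero probability there.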

Looking back at the tree might supply some intuition: in conditioning Y on X, whose values were determined later, we thereby have complete information about Y along any sets of paths having nonzero probability of occurring.

Source: Link, Question Author: Ziyuan, Answer Author: whuber
