In the discussion following a recent question about whether the standard deviation can exceed the mean, one question was raised briefly but never fully answered. So I am asking it here.

Consider a set of $n$ nonnegative numbers $x_i$ where $0 \le x_i \le c$ for $1 \le i \le n$. It is not required that the $x_i$ be distinct, that is, the set could be a multiset.

The mean and variance of the set are defined as

$$\bar{x} = \frac{1}{n}\sum_{i=1}^n x_i, \qquad \sigma_x^2 = \frac{1}{n}\sum_{i=1}^n (x_i - \bar{x})^2 = \left(\frac{1}{n}\sum_{i=1}^n x_i^2\right) - \bar{x}^2$$

and the standard deviation is $\sigma_x$. Note that the set of numbers is *not* a sample from a population and we are not estimating a population mean or a population variance.

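The question then is: What is the maximum value of $\sigma_x/\bar{x}$, the coefficient of variation, over all choices of the $x_i$'s in the interval $[0,c]$?

The maximum value that I can find for $\sigma_x/\bar{x}$ is $\sqrt{n-1}$, which is achieved when $n-1$ of the $x_i$ have value $0$ and the remaining (outlier) $x_i$ has value $c$, giving

$$\bar{x} = \frac{c}{n}, \quad \frac{1}{n}\sum x_i^2 = \frac{c^2}{n} \Rightarrow \sigma_x = \sqrt{\frac{c^2}{n} - \frac{c^2}{n^2}} = \frac{c}{n}\sqrt{n-1}.$$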

But this does not depend on $c$ at all, and I am wondering if larger values, possibly dependent on both $n$ and $c$, can be achieved.

Any ideas? I am sure that this question has been studied in the statistical literature before, and so references, if not the actual results, would be much appreciated.

**Answer**

Geometry provides insight and classical inequalities afford easy access to rigor.

### Geometric solution

We know, from the geometry of least squares, that $\bar{\mathbf{x}} = (\bar{x}, \bar{x}, \ldots, \bar{x})$ is the orthogonal projection of the vector of data $\mathbf{x} = (x_1, x_2, \ldots, x_n)$ onto the linear subspace generated by the constant vector $(1,1,\ldots,1)$ and that $\sigma_x$ is directly proportional to the (Euclidean) distance between $\mathbf{x}$ and $\bar{\mathbf{x}}$. The non-negativity constraints are linear and distance is a convex function, whence the extremes of distance must be attained at the edges of the cone determined by the constraints. This cone is the positive orthant in $\mathbb{R}^n$ and its edges are the coordinate axes, whence it immediately follows that all but one of the $x_i$ must be zero at the maximum distances. For such a set of data, a direct (simple) calculation shows $\sigma_x/\bar{x} = \sqrt{n}$ when $\sigma_x$ is computed with the divisor $n-1$; with the divisor $n$ used in the question, the same data give $\sqrt{n-1}$.
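The "direct calculation" can be checked numerically. The following is a minimal sketch using only Python's standard library; the helper names `coefficient_of_variation` and `one_nonzero` are made up for this illustration. It evaluates the ratio for a data set with a single nonzero value $c$ under both variance conventions (divisor $n-1$ and divisor $n$) and for several values of $c$.

```python
import math
import statistics


def coefficient_of_variation(xs, sample=True):
    """Return sd / mean.

    sample=True uses the n-1 divisor (statistics.stdev);
    sample=False uses the n divisor (statistics.pstdev).
    """
    sd = statistics.stdev(xs) if sample else statistics.pstdev(xs)
    return sd / statistics.fmean(xs)


def one_nonzero(n, c):
    """Hypothetical helper: one value equal to c and n-1 zeros."""
    return [float(c)] + [0.0] * (n - 1)


if __name__ == "__main__":
    for n in (2, 5, 10):
        for c in (1.0, 7.5, 1000.0):
            xs = one_nonzero(n, c)
            cv_sample = coefficient_of_variation(xs, sample=True)
            cv_pop = coefficient_of_variation(xs, sample=False)
            # Compare against sqrt(n) and sqrt(n-1); c should not matter.
            print(n, c, cv_sample, math.sqrt(n), cv_pop, math.sqrt(n - 1))
```

The printed ratios should not change with $c$, which is consistent with the scale invariance of $\sigma_x/\bar{x}$.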

### Solution exploiting classical inequalities

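$\sigma_x/\bar{x}$ is optimized simultaneously with any monotonic transformation thereof. In light of this, let’s maximize

$$\frac{x_1^2+x_2^2+\ldots+x_n^2}{(x_1+x_2+\ldots+x_n)^2} = \frac{1}{n}\left(\frac{n-1}{n}\left(\frac{\sigma_x}{\bar{x}}\right)^2+1\right) = f\left(\frac{\sigma_x}{\bar{x}}\right).$$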

(The formula for $f$ may look mysterious until you realize it just records the steps one would take in algebraically manipulating $\sigma_x/\bar{x}$ to get it into a simple-looking form, which is the left hand side.)
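For readers who want those steps written out, here is one way to do the algebra. It assumes $\sigma_x^2$ is the variance with divisor $n-1$, which is the convention under which the displayed formula for $f$ holds.

$$\eqalign{
\sum_{i=1}^n x_i^2 &= (n-1)\,\sigma_x^2 + n\,\bar{x}^2, \qquad \left(\sum_{i=1}^n x_i\right)^2 = n^2\,\bar{x}^2, \\
\frac{\sum_{i=1}^n x_i^2}{\left(\sum_{i=1}^n x_i\right)^2} &= \frac{(n-1)\,\sigma_x^2 + n\,\bar{x}^2}{n^2\,\bar{x}^2} = \frac{1}{n}\left(\frac{n-1}{n}\left(\frac{\sigma_x}{\bar{x}}\right)^2 + 1\right).
}$$

With the divisor $n$ instead, the first identity becomes $\sum x_i^2 = n\,\sigma_x^2 + n\,\bar{x}^2$ and the factor $\frac{n-1}{n}$ is replaced by $1$.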

An easy way begins with Hölder’s Inequality,

$$x_1^2+x_2^2+\ldots+x_n^2 \le (x_1+x_2+\ldots+x_n)\max(\{x_i\}).$$

(This needs no special proof in this simple context: merely replace one factor of each term $x_i^2 = x_i \times x_i$ by the maximum component $\max(\{x_i\})$: obviously the sum of squares will not decrease. Factoring out the common term $\max(\{x_i\})$ yields the right hand side of the inequality.)

Because the $x_i$ are not all $0$ (that would leave $\sigma_x/\bar{x}$ undefined), division by the square of their sum is valid and gives the equivalent inequality

$$\frac{x_1^2+x_2^2+\ldots+x_n^2}{(x_1+x_2+\ldots+x_n)^2} \le \frac{\max(\{x_i\})}{x_1+x_2+\ldots+x_n}.$$

Because the denominator cannot be less than the numerator (which itself is just one of the terms in the denominator), the right hand side is dominated by the value $1$, which is achieved only when all but one of the $x_i$ equal $0$. Whence

$$\frac{\sigma_x}{\bar{x}} \le f^{-1}\left(1\right) = \sqrt{\left(1 \times (n - 1)\right)\frac{n}{n-1}}=\sqrt{n}.$$
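The chain of inequalities can be spot-checked on random data. The sketch below is an illustration, not a proof: it draws nonnegative vectors uniformly from $[0, 10]$ (an arbitrary choice of range), asserts both inequalities on each draw, and returns the largest coefficient of variation seen, computed with the $n-1$ divisor. The function name `largest_cv` and its parameters are made up for this example.

```python
import math
import random
import statistics


def largest_cv(n, trials=10_000, seed=0, upper=10.0):
    """Check the inequalities on random nonnegative data; return the largest CV seen."""
    rng = random.Random(seed)
    largest = 0.0
    for _ in range(trials):
        xs = [rng.uniform(0.0, upper) for _ in range(n)]
        total = sum(xs)
        sum_sq = sum(x * x for x in xs)
        # Sum of squares <= (sum) * (max), with a tiny slack for rounding.
        assert sum_sq <= total * max(xs) * (1 + 1e-12)
        # The maximum is one of the terms of the sum, so max / sum <= 1.
        assert max(xs) <= total
        cv = statistics.stdev(xs) / statistics.fmean(xs)  # n-1 divisor
        largest = max(largest, cv)
    return largest


if __name__ == "__main__":
    n = 6
    print(largest_cv(n), "should not exceed", math.sqrt(n))
```

Random draws rarely land near the extreme configuration, so the largest value observed will typically sit well below $\sqrt{n}$; the bound is attained only when a single $x_i$ is nonzero.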

### Alternative approach

Because the $x_i$ are nonnegative and cannot sum to $0$, the values $p(i) = x_i/(x_1+x_2+\ldots+x_n)$ determine a probability distribution $F$ on $\{1,2,\ldots,n\}$. Writing $s$ for the sum of the $x_i$, we recognize

$$\eqalign{
\frac{x_1^2+x_2^2+\ldots+x_n^2}{(x_1+x_2+\ldots+x_n)^2} &= \frac{x_1^2+x_2^2+\ldots+x_n^2}{s^2} \\
&= \left(\frac{x_1}{s}\right)\left(\frac{x_1}{s}\right)+\left(\frac{x_2}{s}\right)\left(\frac{x_2}{s}\right) + \ldots + \left(\frac{x_n}{s}\right)\left(\frac{x_n}{s}\right)\\
&= p_1 p_1 + p_2 p_2 + \ldots + p_n p_n\\
&= \mathbb{E}_F[p].
}$$

The axiomatic fact that no probability can exceed $1$ implies this expectation cannot exceed $1$, either, but it’s easy to make it equal to $1$ by setting all but one of the $p_i$ equal to $0$, which means exactly one of the $x_i$ is nonzero. Compute the coefficient of variation as in the last line of the geometric solution above.
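The probability view translates directly into a short computation. In this sketch (function names are invented for the illustration), `expected_p` computes $\mathbb{E}_F[p] = \sum p_i^2$ and `cv_from_expected_p` inverts $f$, assuming the $n-1$ divisor that the formula for $f$ corresponds to, so the result can be compared with a direct computation of $\sigma_x/\bar{x}$.

```python
import math
import statistics


def expected_p(xs):
    """E_F[p] = sum of p_i**2, where p_i = x_i / sum(xs)."""
    s = sum(xs)
    return sum((x / s) ** 2 for x in xs)


def cv_from_expected_p(e, n):
    """Invert f(y) = (1/n) * ((n-1)/n * y**2 + 1) to recover y = sd / mean.

    Assumes the standard deviation uses the n-1 divisor.
    """
    # max(...) guards against a tiny negative value caused by rounding.
    return math.sqrt(max(0.0, n * e - 1.0) * n / (n - 1))


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0, 4.0]
    direct = statistics.stdev(xs) / statistics.fmean(xs)
    via_p = cv_from_expected_p(expected_p(xs), len(xs))
    print(direct, via_p)  # the two values should agree up to rounding

    point_mass = [0.0, 0.0, 5.0, 0.0]
    print(expected_p(point_mass))  # all probability on one index
```

Putting all the probability on a single index drives `expected_p` to its upper limit of 1, which is the extreme case described in the text.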

**Attribution** *Source: Link, Question Author: Dilip Sarwate, Answer Author: whuber*