Mean difference

From Wikipedia, the free encyclopedia

The mean difference is a statistical measure of dispersion and is equal to the average absolute difference of two independent values drawn from a probability distribution. A related statistic is the relative mean difference, which is the mean difference divided by the arithmetic mean. An important relationship is that the relative mean difference is equal to twice the Gini coefficient, which is defined in terms of the Lorenz curve.

The mean difference is also known as the absolute mean difference and the Gini mean difference. The mean difference is sometimes denoted by Δ or as MD. The mean deviation is a different measure of dispersion.



For a population of size n, with a sequence of values y_i, i = 1 to n:

MD = \frac{1}{n^2} \sum_{i=1}^n \sum_{j=1}^n | y_i - y_j |
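
As a minimal sketch of the population formula above, the following Python computes MD by direct summation over all n^2 ordered pairs, and again from the sorted values, which avoids the double loop. The function names and the example population are illustrative assumptions, not part of the source.

```python
def mean_difference(values):
    """Average of |y_i - y_j| over all n^2 ordered pairs (i = j included)."""
    n = len(values)
    return sum(abs(a - b) for a in values for b in values) / n**2


def mean_difference_sorted(values):
    """Same quantity in O(n log n): in sorted order (0-based index i), the value
    y_(i) enters the sum over pairs i < j with net coefficient 2*i - n + 1."""
    ys = sorted(values)
    n = len(ys)
    pair_sum = sum((2 * i - n + 1) * y for i, y in enumerate(ys))
    return 2 * pair_sum / n**2


population = [1, 2, 4, 7]  # hypothetical example data
md_direct = mean_difference(population)
md_fast = mean_difference_sorted(population)
print(md_direct, md_fast)  # 2.5 2.5
```

The direct version mirrors the formula term by term; the sorted version is likely preferable for large populations.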

For a discrete probability function f(y), where y_i, i = 1 to n, are the values with nonzero probabilities:

MD = \sum_{i=1}^n \sum_{j=1}^n f(y_i) f(y_j) | y_i - y_j |
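
The discrete formula can be sketched in the same way. The example below assumes the distribution is supplied as a dictionary from value to probability (a representation chosen here for illustration) and uses exact rational arithmetic so the result is not blurred by rounding.

```python
from fractions import Fraction


def mean_difference_pmf(pmf):
    """MD for a discrete distribution given as {value: probability}."""
    return sum(p * q * abs(x - y)
               for x, p in pmf.items()
               for y, q in pmf.items())


# Hypothetical example: a fair six-sided die with exact probabilities.
fair_die = {face: Fraction(1, 6) for face in range(1, 7)}
md_die = mean_difference_pmf(fair_die)
print(md_die)  # 35/18
```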

For a probability density function f(x):

MD = \int_{-\infty}^\infty \int_{-\infty}^\infty f(x)\,f(y)\,|x-y|\,dx\,dy

For a cumulative distribution function F(x) with inverse x(F):

MD = \int_0^1 \int_0^1 |x(F_1)-x(F_2)|\,dF_1\,dF_2

The inverse x(F) may not exist, because the cumulative distribution function may have jump discontinuities or intervals on which it is constant. However, the previous formula still applies if the definition of x(F) is generalized to:

x(F_1) = \inf \{ y : F(y) \geq F_1 \}
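
To illustrate the generalized inverse, the sketch below applies it to a small discrete distribution, where the ordinary inverse does not exist, and approximates the double integral over the unit square with a midpoint rule. The distribution, the grid size m, and the function names are assumptions made for this example. The grid is chosen so that the jumps of F fall on cell boundaries, which makes the midpoint rule exact here.

```python
def generalized_inverse(pmf, u):
    """x(u) = inf{y : F(y) >= u} for a discrete distribution {value: probability}."""
    cumulative = 0.0
    for value in sorted(pmf):
        cumulative += pmf[value]
        if cumulative >= u:
            return value
    return max(pmf)  # guard against rounding when u is very close to 1


def mean_difference_quantile(pmf, m=100):
    """Midpoint-rule approximation of the double integral of |x(F_1) - x(F_2)|."""
    quantiles = [generalized_inverse(pmf, (k + 0.5) / m) for k in range(m)]
    return sum(abs(a - b) for a in quantiles for b in quantiles) / m**2


pmf = {0: 0.25, 1: 0.5, 3: 0.25}  # hypothetical distribution with jumps in F
md_q = mean_difference_quantile(pmf)
print(md_q)  # 1.125
```

The same value follows from the discrete double sum: 2(0.25)(0.5)(1) + 2(0.25)(0.25)(3) + 2(0.5)(0.25)(2) = 1.125.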


When the probability distribution has a finite and nonzero arithmetic mean, the relative mean difference, sometimes denoted by ∇ or RMD, is defined by:

RMD = \frac{MD}{Arithmetic\,Mean}

The relative mean difference expresses the mean difference relative to the size of the mean and is a dimensionless quantity. It is equal to twice the Gini coefficient, which is defined in terms of the Lorenz curve. This relationship gives complementary perspectives on both the relative mean difference and the Gini coefficient, including alternative ways of calculating their values.
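
The relationship with the Gini coefficient can be checked numerically on a small population. The sketch below assumes a finite population of positive integers and takes the Gini coefficient as one minus twice the area under the piecewise-linear Lorenz curve; the function names and data are illustrative.

```python
from fractions import Fraction


def relative_mean_difference(values):
    """RMD of a finite integer-valued population (1/n^2 normalization)."""
    n = len(values)
    md = Fraction(sum(abs(a - b) for a in values for b in values), n**2)
    mean = Fraction(sum(values), n)
    return md / mean


def gini_from_lorenz(values):
    """Gini coefficient as 1 - 2 * (area under the piecewise-linear Lorenz curve)."""
    ys = sorted(values)
    n, total = len(ys), sum(ys)
    area = Fraction(0)
    cumulative = 0
    for y in ys:
        previous_share = Fraction(cumulative, total)
        cumulative += y
        current_share = Fraction(cumulative, total)
        # Trapezoid over one population step of width 1/n.
        area += (previous_share + current_share) / (2 * n)
    return 1 - 2 * area


population = [1, 2, 4, 7]  # hypothetical data with a positive mean
rmd = relative_mean_difference(population)
gini = gini_from_lorenz(population)
print(rmd, gini)  # 5/7 5/14
```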


The mean difference is invariant to translations and negation, and varies proportionally to positive scaling. That is to say, if X is a random variable and c is a constant:

  • MD(X+c) = MD(X),
  • MD(-X) = MD(X), and
  • MD(c X) = |c| MD(X).
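
A short derivation may make these three identities easier to see. It is a sketch that assumes the mean difference is written as MD(X) = E|X_1 - X_2|, where X_1 and X_2 are independent copies of X; the symbols E, X_1 and X_2 are notation introduced here for illustration:

MD(X+c) = E|(X_1 + c) - (X_2 + c)| = E|X_1 - X_2| = MD(X)

MD(-X) = E|(-X_1) - (-X_2)| = E|X_2 - X_1| = MD(X)

MD(cX) = E|c X_1 - c X_2| = |c| \, E|X_1 - X_2| = |c| \, MD(X)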

The relative mean difference is invariant to positive scaling, commutes with negation, and varies under translation in proportion to the ratio of the original and translated arithmetic means. That is to say, if X is a random variable and c is a constant:

  • RMD(X+c) = RMD(X) * mean(X)/(mean(X)+c) = RMD(X) / (1+c / mean(X)) for c ≠ -mean(X),
  • RMD(-X) = -RMD(X), and
  • RMD(c X) = RMD(X) for c > 0.
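
The three properties of the relative mean difference listed above can be checked exactly on a small population. This is only a sketch: the helper names, the population, and the constant c = 10 are assumptions made for the example.

```python
from fractions import Fraction


def md(values):
    n = len(values)
    return sum(abs(a - b) for a in values for b in values) / Fraction(n**2)


def mean(values):
    return sum(values) / Fraction(len(values))


def rmd(values):
    return md(values) / mean(values)


x = [1, 2, 4, 7]  # hypothetical population
c = 10

translated = rmd([v + c for v in x])  # translation rescales RMD by mean/(mean + c)
negated = rmd([-v for v in x])        # negation flips the sign
scaled = rmd([3 * v for v in x])      # positive scaling leaves RMD unchanged

print(rmd(x), translated, negated, scaled)  # 5/7 5/27 -5/7 5/7
```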

If a random variable has a positive mean, then its relative mean difference is always greater than or equal to zero. If, in addition, the random variable takes only values greater than or equal to zero, then its relative mean difference is less than 2.


Both the standard deviation and the mean difference measure dispersion, that is, how spread out the values of a population or the probabilities of a distribution are. The mean difference is not defined in terms of a specific measure of central tendency, whereas the standard deviation is defined in terms of the deviation from the arithmetic mean. Because the standard deviation squares its differences, it tends to give more weight to larger differences and less weight to smaller differences compared to the mean difference. When the arithmetic mean is finite, the mean difference will also be finite, even when the standard deviation is infinite. See the examples for some specific comparisons.


For a random sample S from a random variable X, consisting of n values y_i, the statistic:

MD(S) = \frac{\sum_{i=1}^n \sum_{j=1}^n | y_i - y_j |}{n(n-1)}

is a consistent and unbiased estimator of MD(X).

The statistic:

RMD(S) = \frac{\sum_{i=1}^n \sum_{j=1}^n | y_i - y_j |}{(n-1)\,\sum_{i=1}^n y_i}

is a consistent estimator of RMD(X), but is not, in general, unbiased.
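
The two sample statistics above can be sketched as follows. The function names are hypothetical, and the example assumes an integer-valued sample so that exact fractions can be used; for real-valued data, ordinary floating-point division would be used instead.

```python
from fractions import Fraction


def abs_difference_sum(sample):
    """Sum of |y_i - y_j| over all ordered pairs."""
    return sum(abs(a - b) for a in sample for b in sample)


def md_estimate(sample):
    """MD(S): pair sum divided by n(n-1); needs n >= 2."""
    n = len(sample)
    return Fraction(abs_difference_sum(sample), n * (n - 1))


def rmd_estimate(sample):
    """RMD(S): pair sum divided by (n-1) * sum(y_i); needs a nonzero sum."""
    n = len(sample)
    return Fraction(abs_difference_sum(sample), (n - 1) * sum(sample))


sample = [1, 2, 4, 7]  # hypothetical integer-valued sample
print(md_estimate(sample), rmd_estimate(sample))  # 10/3 20/21
```

Note that MD(S) divides by n(n-1) rather than n^2, so on the same four values it gives 10/3 rather than the population value 5/2.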

Confidence intervals for RMD(X) can be calculated using bootstrap sampling techniques.
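
One possible way to do this is a percentile bootstrap, sketched below. The text does not say which bootstrap variant is intended, so the percentile method, the number of resamples, the seeds, and the synthetic exponential data are all assumptions made for illustration.

```python
import random


def rmd_estimate(sample):
    """RMD(S) computed from sorted values in O(n log n)."""
    ys = sorted(sample)
    n = len(ys)
    # Sum over ordered pairs = 2 * sum_i (2i - n + 1) * y_(i), with 0-based i.
    pair_sum = 2 * sum((2 * i - n + 1) * y for i, y in enumerate(ys))
    return pair_sum / ((n - 1) * sum(ys))


def bootstrap_ci(sample, n_resamples=500, level=0.95, seed=0):
    """Percentile bootstrap interval for RMD (one common variant)."""
    rng = random.Random(seed)
    n = len(sample)
    stats = sorted(
        rmd_estimate([rng.choice(sample) for _ in range(n)])
        for _ in range(n_resamples)
    )
    tail = (1 - level) / 2
    lower = stats[int(tail * n_resamples)]
    upper = stats[int((1 - tail) * n_resamples) - 1]
    return lower, upper


# Synthetic data for illustration only: 200 draws from an exponential distribution.
data_rng = random.Random(1)
data = [data_rng.expovariate(1.0) for _ in range(200)]

point = rmd_estimate(data)
lower, upper = bootstrap_ci(data)
print(point, lower, upper)
```

For exponentially distributed data the true RMD is 1, so the point estimate would be expected to land near that value.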

There does not exist, in general, an unbiased estimator for RMD(X), in part because of the difficulty of finding an unbiased estimate of the reciprocal of the mean. For example, suppose the sample is known to be taken from a random variable X(p) for an unknown p, where X(p) - 1 has the Bernoulli distribution, so that Pr(X(p) = 1) = 1 - p and Pr(X(p) = 2) = p. Then:

RMD(X(p)) = 2p(1-p)/(1+p)

But the expected value of any estimator R(S) of RMD(X(p)) will be of the form:

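E(R(S)) = \sum_{i=0}^n \,p^i (1-p)^{n-i} r_i

where the r_i are constants. This is a polynomial in p of degree at most n, whereas RMD(X(p)) is a rational function of p that is not a polynomial, so E(R(S)) can never equal RMD(X(p)) for all p between 0 and 1.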
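
The bias can be seen concretely for the particular estimator RMD(S) by enumerating every possible sample. The sketch below is an illustration for one estimator, not a proof of the general claim; the choices p = 1/2 and n = 3 and the function names are assumptions made for the example.

```python
from fractions import Fraction
from itertools import product


def pair_sum(sample):
    return sum(abs(a - b) for a in sample for b in sample)


def expected_estimates(p, n):
    """Exact E[MD(S)] and E[RMD(S)] for samples of size n from X(p),
    where Pr(X = 2) = p and Pr(X = 1) = 1 - p."""
    e_md = Fraction(0)
    e_rmd = Fraction(0)
    for sample in product((1, 2), repeat=n):
        twos = sample.count(2)
        prob = p**twos * (1 - p)**(n - twos)
        e_md += prob * Fraction(pair_sum(sample), n * (n - 1))
        e_rmd += prob * Fraction(pair_sum(sample), (n - 1) * sum(sample))
    return e_md, e_rmd


p = Fraction(1, 2)  # hypothetical value of the unknown parameter
e_md, e_rmd = expected_estimates(p, n=3)
true_md = 2 * p * (1 - p)
true_rmd = 2 * p * (1 - p) / (1 + p)
print(e_md, true_md)    # 1/2 1/2
print(e_rmd, true_rmd)  # 27/80 1/3
```

Here MD(S) matches the true mean difference in expectation, while RMD(S) overshoots the true value 1/3 slightly.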


Examples of Mean Difference and Relative Mean Difference

Distribution | Parameters | Mean | Standard Deviation | Mean Difference | Relative Mean Difference
Continuous uniform distribution | a = 0 ; b = 1 | 1 / 2 = 0.5 | \frac{1}{\sqrt{12}} ≈ 0.2887 | 1 / 3 ≈ 0.3333 | 2 / 3 ≈ 0.6667
Normal distribution | μ = 1 ; σ = 1 | 1 | 1 | \frac{2}{\sqrt{\pi}} ≈ 1.1284 | \frac{2}{\sqrt{\pi}} ≈ 1.1284
Exponential distribution | λ = 1 | 1 | 1 | 1 | 1
Pareto distribution | k > 1 ; x_m = 1 | \frac{k}{k-1} | \frac{1}{k-1}\,\sqrt{\frac{k}{k-2}} (for k > 2) | \frac{2k}{(k-1)(2k-1)} | \frac{2}{2k-1}
Gamma distribution | k ; θ | k θ | \sqrt{k}\,\theta | k θ (2 - 4 I_{0.5}(k+1, k)) † | 2 - 4 I_{0.5}(k+1, k) †
Gamma distribution | k = 1 ; θ = 1 | 1 | 1 | 1 | 1
Gamma distribution | k = 2 ; θ = 1 | 2 | \sqrt{2} ≈ 1.4142 | 3 / 2 = 1.5 | 3 / 4 = 0.75
Gamma distribution | k = 3 ; θ = 1 | 3 | \sqrt{3} ≈ 1.7321 | 15 / 8 = 1.875 | 5 / 8 = 0.625
Gamma distribution | k = 4 ; θ = 1 | 4 | 2 | 35 / 16 = 2.1875 | 35 / 64 = 0.546875
Bernoulli distribution | 0 ≤ p ≤ 1 | p | \sqrt{p(1-p)} | 2 p (1-p) | 2 (1-p) for p > 0

† I_z(x, y) is the regularized incomplete Beta function
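
The Gamma rows for integer k can be reproduced exactly from the general formula. The sketch below assumes integer shape k and relies on the standard identity that, for positive integers a and b, I_x(a, b) equals the probability that a Binomial(a + b - 1, x) variable is at least a; the function names are illustrative.

```python
from fractions import Fraction
from math import comb


def reg_inc_beta_half(a, b):
    """I_{1/2}(a, b) for positive integers a, b, via the binomial-tail identity
    I_x(a, b) = P(Bin(a + b - 1, x) >= a) evaluated at x = 1/2."""
    n = a + b - 1
    return Fraction(sum(comb(n, j) for j in range(a, n + 1)), 2**n)


def gamma_rmd(k):
    """Relative mean difference of a Gamma distribution with integer shape k."""
    return 2 - 4 * reg_inc_beta_half(k + 1, k)


def gamma_md(k, theta=1):
    """Mean difference: the mean k * theta times the relative mean difference."""
    return k * theta * gamma_rmd(k)


for k in (1, 2, 3, 4):
    print(k, gamma_md(k), gamma_rmd(k))
```

For k = 1 to 4 with θ = 1 this yields mean differences 1, 3/2, 15/8 and 35/16, and relative mean differences 1, 3/4, 5/8 and 35/64, in agreement with the table.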



