The lognormal distribution

THE LOGNORMAL DISTRIBUTION (LND)

Descriptions of the LND usually give the probability density P(x) as:

$$P(x) = \frac{1}{S x \sqrt{2\pi}}\, e^{-(\ln x - M)^2 / 2S^2}$$

in which S is the standard deviation σ of the normal distribution (ND) of ln x. The median of the LND is $M' = e^{M}$.

Note that if M = 0, then M' = 1.

The mode of this LND is $M_0 = e^{M - S^2}$ (Aitchison and Brown: The log normal Distribution with special...), which is the same as $M_0 = M' e^{-S^2}$ or $M' = M_0 e^{S^2}$.

(It can also be found by differentiating P(x).)

The expression for P(x) can also be written as:

$$P(x) = \frac{1}{S x \sqrt{2\pi}}\, e^{-(\ln x - \ln e^{M})^2 / 2S^2}$$

The exponent is then $-\ln^2(x/e^{M}) / 2S^2$ and, with $M' = e^{M}$,

$$-\ln^2(x/M') / 2S^2.$$

Putting $z = x/M'$, so that $x = zM'$ and $dx = M'\,dz$:

$$P(x)\,dx = \frac{1}{S z M' \sqrt{2\pi}}\, e^{-\ln^2 z / 2S^2}\, M'\,dz = P(z)\,dz$$

where $z = x / (M_0 e^{S^2})$.

Now we can change the variate to $y = x/M_0$:

$$z = y\,e^{-S^2}, \quad y = z\,e^{S^2}, \quad dy = e^{S^2}\,dz, \quad dz = e^{-S^2}\,dy$$

The exponent becomes $-\ln^2(y\,e^{-S^2}) / 2S^2 = -(\ln y - S^2)^2 / 2S^2$, and since

$$P(z) = \frac{1}{S z \sqrt{2\pi}}\, e^{-\ln^2 z / 2S^2}$$

we have

$$P(z)\,dz = \frac{e^{S^2}}{S y \sqrt{2\pi}}\, e^{-(\ln y - S^2)^2 / 2S^2}\, e^{-S^2}\,dy = P(y)\,dy$$

We have $(\ln y - S^2)^2 = \ln^2 y - 2S^2 \ln y + S^4$ and therefore

$$P(y) = \frac{1}{S y \sqrt{2\pi}}\, e^{-\ln^2 y / 2S^2}\, e^{\ln y}\, e^{-S^2/2}$$

and, as $e^{\ln y} = y$, we finally have

$$P(y) = \frac{1}{S \sqrt{2\pi}}\, e^{-S^2/2}\, e^{-\ln^2 y / 2S^2} \qquad (1)$$

For y = 1, x = M₀ is the mode, and $P(y) = P(y_{max}) = \frac{1}{S\sqrt{2\pi}}\, e^{-S^2/2}$.
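Formula (1) can be cross-checked numerically. The following is a minimal sketch in Python, assuming an illustrative value S = 0.5 and function names of my own choosing; it integrates the density with Simpson's rule in the variable u = ln y and evaluates the peak at y = 1.

```python
import math

def p_mode_referred(y, s):
    """Density (1): the LND with y = x / M0, i.e. measured in units of the mode."""
    peak = math.exp(-s * s / 2.0) / (s * math.sqrt(2.0 * math.pi))
    return peak * math.exp(-math.log(y) ** 2 / (2.0 * s * s))

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

s = 0.5  # illustrative value of S, not taken from the text
# Integrate in u = ln y so the long right tail is covered: P(y) dy = P(e^u) e^u du.
area = simpson(lambda u: p_mode_referred(math.exp(u), s) * math.exp(u), -10.0, 10.0)
peak = p_mode_referred(1.0, s)
print(area, peak)
```

The area should come out very close to 1, and the density at y = 1 should be larger than at any neighbouring point.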

The LND is now referred to the mode and, as we will see, it has some special properties.

First, as ln y = -ln(1/y), and therefore ln² y = ln²(1/y), the probability densities of reciprocal values referred to the mode are the same. That is, P(0.5 M₀) is the same as P(2 M₀). It follows that M₀ is the geometric mean of any two abscissae that have the same probability density.

The mean is $m = e^{M + S^2/2}$, so $m = M' e^{S^2/2} = M_0\, e^{S^2} e^{S^2/2} = M_0\, e^{3S^2/2}$

and $M' = M_0\, e^{S^2}$.

That gives a good way to check whether a distribution curve is an LND. First read the abscissa of the mode M₀, call it X₀; then take two points of the curve with the same ordinate and read their abscissae X₁ and X₂. If X₁/X₀ = X₀/X₂, the distribution may be LN.
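The check can be tried on a computed curve. This is a rough sketch in Python under stated assumptions: the parameters M = 1.0 and S = 0.4, the half-height ordinate, and the helper names `bisect` and `equal_ordinate_ratios` are all illustrative and do not come from the text.

```python
import math

def lognormal_pdf(x, m, s):
    """Usual LND density with parameters m and s of ln x."""
    return math.exp(-(math.log(x) - m) ** 2 / (2 * s * s)) / (x * s * math.sqrt(2 * math.pi))

def bisect(f, lo, hi, iters=200):
    """Root of f on [lo, hi], assuming f(lo) and f(hi) have opposite signs."""
    flo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        fmid = f(mid)
        if (fmid > 0) == (flo > 0):
            lo, flo = mid, fmid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def equal_ordinate_ratios(pdf, x0, level, x_min, x_max):
    """Return (X1/X0, X0/X2) for the two abscissae where pdf = level * pdf(x0)."""
    target = level * pdf(x0)
    g = lambda x: pdf(x) - target
    x1 = bisect(g, x0, x_max)  # point to the right of the mode
    x2 = bisect(g, x_min, x0)  # point to the left of the mode
    return x1 / x0, x0 / x2

m, s = 1.0, 0.4            # illustrative parameters
x0 = math.exp(m - s * s)   # mode of the LND
r1, r2 = equal_ordinate_ratios(lambda x: lognormal_pdf(x, m, s), x0, 0.5, 1e-6, 1e3)
print(r1, r2)
```

For an LND the two printed ratios should agree; for a curve read from a graph, the agreement can only be as good as the reading of the abscissae.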

The moments with reference to M₀ are: $M_j = M_0^{\,j}\, e^{jS^2}\, e^{j^2 S^2/2}$, which for j = 1 gives the mean $M_0\, e^{3S^2/2}$.

Figures for all the text

The relative probability density function. (RLND)

Formula (1) can be written $P(y) = P(y_{max})\, e^{-\ln^2 y / 2S^2}$ and then

$$P(y) / P(y_{max}) = P_r(y) = e^{-\ln^2 y / 2S^2}$$

Pr(y) is a new concept: the relative probability density of the LND (RLND) in which the variate y is referred to the mode; that is to say, y is measured in units of M₀.

As $\int P(y)\,dy = 1$, it follows that $\int P_r(y)\,dy = 1/P(y_{max})$.

It is easier to work with the relative LND referred to the mode than with the LND as it is normally presented. You can calculate the cumulative distribution function for Pr(y) and, to find the corresponding value for P(y), you only have to multiply by P(ymax).

From the statistical point of view, it is sometimes easy to estimate the mode and, if the mean has been calculated, the parameter S follows immediately from $m = M_0\, e^{3S^2/2}$, that is, $S^2 = \frac{2}{3}\ln(m/M_0)$. As S has no direct meaning in the distribution, I normally use a = 1/S as the parameter (some authors use it too), with the name "concentration". Some graphs of P(y) and Pr(y) for different values of S are attached.
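As a small illustration of how S follows from the mode and the mean, the sketch below inverts the relation m = M₀ e^{3S²/2}. It is written in Python; the function name and the test values M = 0.3, S = 0.7 are assumptions made only for the example.

```python
import math

def s_from_mode_and_mean(mode, mean):
    """Solve mean = mode * exp(3 S^2 / 2) for S (requires 0 < mode < mean)."""
    if not 0.0 < mode < mean:
        raise ValueError("an LND needs 0 < mode < mean")
    return math.sqrt(2.0 * math.log(mean / mode) / 3.0)

# Round trip with illustrative parameters M and S of ln x.
M, S = 0.3, 0.7
mode = math.exp(M - S * S)
mean = math.exp(M + S * S / 2.0)
s_est = s_from_mode_and_mean(mode, mean)
concentration = 1.0 / s_est  # the parameter a = 1/S used in the text
print(s_est, concentration)
```

With real data the mode is only an estimate, so the recovered S inherits its uncertainty.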

The relative distribution concept is also valid for the ND; in this case, if M = 0, $P_r(x) = e^{-x^2/2S^2}$ (RND) (2)

It can be applied to any other distribution too.

Tangent from the origin.

Given an RLND, let us calculate the tangent to the curve that passes through the origin of coordinates.

The slope of the tangent at a point of the curve $P_r(x) = e^{-\ln^2 x / 2S^2}$ is, differentiating, $P_r'(x) = P_r(x)\,(-\ln x)/(S^2 x)$, and if the tangent has to pass through the origin, it must be $P_r'(x) = P_r(x)/x$. Then $P_r(x)/x = P_r(x)\,(-\ln x)/(S^2 x)$, or $-\ln x / S^2 = 1$, so $\ln x = -S^2$ and $x = e^{-S^2}$. But $e^{-S^2} = M_0/M'$, the inverse of the median measured in units of the mode. This makes it easy to find the median, and from it S and the mean, when one has the curve of the probability distribution. It is easy to draw examples.
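The tangent point can also be located numerically, without using the analytic derivative. The sketch below is in Python and makes several assumptions: S = 0.6 is illustrative, the slope is taken by a central difference, and the bracket [0.01, 1] only works for moderate S (it needs e^{-S²} > 0.01).

```python
import math

def pr(x, s):
    """Relative LND referred to the mode: exp(-ln^2 x / 2 S^2)."""
    return math.exp(-math.log(x) ** 2 / (2.0 * s * s))

def tangent_point(s, lo=0.01, hi=1.0, iters=100, step=1e-6):
    """Abscissa where the tangent to Pr passes through the origin (bisection)."""
    def gap(x):
        slope = (pr(x + step, s) - pr(x - step, s)) / (2.0 * step)
        return slope * x - pr(x, s)  # zero when slope == Pr(x) / x
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid   # tangent point lies to the right
        else:
            hi = mid   # tangent point lies to the left
    return 0.5 * (lo + hi)

s = 0.6  # illustrative value of S
x_t = tangent_point(s)
print(x_t, math.exp(-s * s))
```

The two printed numbers should agree to within the accuracy of the finite-difference slope.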

Some outcomes from the RND and RLND.

Elliptic Normal Distributions (END)

In (2), putting Pr(x) = y and taking logarithms, we have:

$\ln y = -x^2/2S^2$, which is the equation of a parabola with parameter S²,

$$x^2 = -2S^2 \ln y.$$

If the RND can be generated from a parabola, we can assume that there exists a distribution generated from an ellipse, the parabola being the limit of an ellipse when the axes 2c (the major one) and 2b (the minor one, parallel to the x axis) increase without bound while b²/c remains constant. The major semi-axis is designated c to avoid confusion with the concentration a.

The equation of the ellipse equivalent to the preceding parabola is:

$$(\ln y + c)^2/c^2 + x^2/b^2 = 1$$

and from here:

$$\ln y = -c\left(1 - \sqrt{1 - (x/b)^2}\right) \quad\text{and}\quad y = e^{-c\left(1 - \sqrt{1 - (x/b)^2}\right)}.$$

Of the two signs of the root, + is taken, because for x = 0 it must be y = Pr(max) = 1. This limits the ellipse to its upper half.

This new kind of distribution can be named the elliptic normal distribution (END).

Distributions of this type are limited to values of x between ±b, and they look more realistic than the ND, because in reality the variate cannot reach the theoretical values of ±∞ that the ND allows. The error in a measurement cannot be ∞, because measuring devices always have a finite size. This opinion agrees with what is said on a web page about the ND (http://mathworl.wolfram.com/):

Because they occur so frequently, there is an unfortunate tendency to invoke normal distributions in situations where they may not be applicable. As Lippmann stated, "Everybody believes in the exponential law of errors: the experimenters, because they think it can be proved by mathematics; and the mathematicians, because they believe it has been established by observation" (Whittaker and Robinson 1967, p. 179).

If you draw a graph with two relative distributions, one END and one ND, with the parameters c = 10, b = 2 for the END and S = 0.5 for the ND, you can see that the shape of the curves is almost the same. And for higher concentrations, c = 15, both curves coincide.

A computer can do the integration easily with Simpson's method. As all the elements of the distribution must be included between -b and +b, we have:

$\int_{-b}^{+b} P(x)\,dx = 1$. As $\int_{-b}^{+b} P_r(x)\,dx = \frac{1}{P(x_{max})}\int_{-b}^{+b} P(x)\,dx = A$, then $P(x_{max}) = 1/A$.
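The Simpson integration mentioned above can be sketched as follows. This is only a minimal Python illustration: the function names are my own, the number of subintervals is arbitrary, and c = 10, b = 2 are the example values quoted for the END.

```python
import math

def end_relative(x, c, b):
    """Relative END: exp(-c (1 - sqrt(1 - (x/b)^2))), defined for |x| <= b."""
    inside = max(0.0, 1.0 - (x / b) ** 2)  # guard against rounding at the borders
    return math.exp(-c * (1.0 - math.sqrt(inside)))

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

c, b = 10.0, 2.0  # the example values quoted in the text
A = simpson(lambda x: end_relative(x, c, b), -b, b)
p_max = 1.0 / A   # P(xmax) = 1/A
print(A, p_max)
```

Multiplying `end_relative` by `p_max` then gives the normalised density P(x) on [-b, +b].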

Putting x/b = sin w, it is possible to simplify the expression of y = Pr(x):

As $dx = b \cos w\,dw$, then $P_r(x)\,dx = P_r(w)\,dw = b\, e^{-c(1-\cos w)} \cos w\,dw$.
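The substitution can be checked by computing the area A both ways. In this Python sketch (names and step count are assumptions), the w form keeps the factor b that comes from dx = b cos w dw, and w runs from -π/2 to +π/2 so that x runs from -b to +b.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def area_in_x(c, b, n=2000):
    """Area under the relative END, integrating directly in x."""
    f = lambda x: math.exp(-c * (1.0 - math.sqrt(max(0.0, 1.0 - (x / b) ** 2))))
    return simpson(f, -b, b, n)

def area_in_w(c, b, n=2000):
    """Same area after x = b sin w; sqrt(1 - (x/b)^2) becomes cos w."""
    f = lambda w: b * math.exp(-c * (1.0 - math.cos(w))) * math.cos(w)
    return simpson(f, -math.pi / 2.0, math.pi / 2.0, n)

c, b = 10.0, 2.0  # illustrative values
print(area_in_x(c, b), area_in_w(c, b))
```

The w form has no square root, so the integrand is smooth at the borders, which is convenient for Simpson's rule.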

Elliptic Log Normal Distributions (ELND)

The same reasoning can be applied to the LND, because in this case $\ln y = -\ln^2 x / 2S^2$. The equivalent ellipse will be:

$$(\ln y + c)^2/c^2 + \ln^2 x / b^2 = 1$$

and $P_r(x) = e^{-c\left(1 - \sqrt{1 - \ln^2 x / b^2}\right)}$. In the case of the ELND, the limits of the variate x, with the mode = 1, are $e^{-b}$ and $e^{+b}$. The ELND also has the property indicated above that P(1/x) = P(x). What was said for the END also holds for the ELND: these are the realistic distributions, because in the real world there are no variates that can reach the value ∞, as the LND allows. The integration must be made between the limits $e^{\pm b}$ to find P(xmax) = 1/A.
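A short numerical sketch of the ELND follows, in Python. The values c = 10 and b = 1 are illustrative only (the text gives none for this case), and the integration is done in u = ln x so that the limits become -b and +b.

```python
import math

def elnd_relative(x, c, b):
    """Relative ELND with the mode at x = 1, defined for e^-b <= x <= e^b."""
    inside = max(0.0, 1.0 - (math.log(x) / b) ** 2)  # guard against rounding
    return math.exp(-c * (1.0 - math.sqrt(inside)))

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

c, b = 10.0, 1.0  # illustrative values, not taken from the text
# Integrate in u = ln x: Pr(x) dx = Pr(e^u) e^u du, with u running from -b to +b.
A = simpson(lambda u: elnd_relative(math.exp(u), c, b) * math.exp(u), -b, b)
p_max = 1.0 / A
print(elnd_relative(0.5, c, b), elnd_relative(2.0, c, b), p_max)
```

The first two printed values should be equal, which is the reciprocal property P(1/x) = P(x).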

With the logarithms of probability as above, it is possible to do philosophy about entropy, information and probability.

Hyperbolic distributions

In the same way as for the elliptic distributions, we can find the expression for the hyperbolic distributions:

$$P_r(x) = e^{c\left(1 - \sqrt{1 + (f(x)/b)^2}\right)}$$

This distribution extends, like the normal distribution, from -∞ to +∞. The problem is to find Pmax(x), because the integration of Pr(x) is not immediate. As happens with the elliptic distributions, at high concentration both distributions coincide. In a graph with relative normal and hyperbolic normal distributions, the shape is almost the same.
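Although the integral is not immediate, Pmax(x) can be obtained numerically. The sketch below, in Python, treats the hyperbolic normal case f(x) = x with illustrative values c = 10, b = 2, and assumes that a finite window of 20 b on each side is wide enough because the tails fall off roughly like e^{-c|x|/b}.

```python
import math

def hyperbolic_relative(x, c, b):
    """Relative hyperbolic normal distribution, f(x) = x."""
    return math.exp(c * (1.0 - math.sqrt(1.0 + (x / b) ** 2)))

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

c, b = 10.0, 2.0           # illustrative values
half_width = 20.0 * b      # assumed cut-off; the neglected tails are negligible here
A = simpson(lambda x: hyperbolic_relative(x, c, b), -half_width, half_width, n=8000)
p_max = 1.0 / A
print(A, p_max)
```

For smaller c the tails are heavier and the window would have to be widened accordingly.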

Centrifugal and Centripetal Distributions

Looking at the expressions of the elliptic N and LN distributions, it is logical to ask: what about the lower half of the ellipse? Could there be distributions with limited edges +b and -b and with the maximum probability density at the borders rather than in the center?

With the maximum probability density at the borders, the equation of such a distribution should be:

$$\ln P_r(x) = -c\sqrt{1 - (x/b)^2} \quad\text{and}\quad \ln P_r(x) = -c\sqrt{1 - (\ln x / b)^2}$$

Do distributions of this kind exist in reality? They could. I am looking for them. In any case they are "centrifugal distributions". It is a subject to investigate. On the other hand, it is logical that distributions having points at ∞, such as the N and LN, cannot be centrifugal; they must be centripetal.

GENERAL EXPRESSION OF THE LND

Let an LND be given in the relative form: $P_r(x) = e^{-\ln^2 x / 2S^2}$

with $P_{max}(x) = e^{-S^2/2} / (S\sqrt{2\pi})$, the mode being x = 1,

and $P(x)\,dx = P(x_{max})\,P_r(x)\,dx$.

If we change the variate to $y = x\,e^{-zS^2}$, z being a number between -∞ and +∞, we have $x = y\,e^{zS^2}$ and $dx = e^{zS^2}\,dy$.

Doing all the substitutions and simplifications, we arrive at

$$P(y) = \frac{1}{S\sqrt{2\pi}}\, e^{-(S^2/2)(1-z)^2}\, y^{-z}\, e^{-\ln^2 y / 2S^2}$$

That gives P(y) with the variate referred to the point that lies a factor $e^{zS^2}$ away from the mode.

For z = 0 we get the expression for the LND referred to the mode = 1.

For z = 1 we get the expression for the LND referred to the median = 1.

For z = 3/2 we get the expression for the LND referred to the mean = 1.
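The prefactor that normalises a density of the form y^{-z} e^{-ln² y / 2S²} can be cross-checked numerically. The Python sketch below (illustrative S = 0.5, helper names of my own choosing) integrates in u = ln y and compares the result with the closed form obtained by completing the square in u; for z = 1 it should reduce to the familiar 1/(S√2π) of the LND with median 1.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def prefactor_numeric(z, s):
    """1 / integral of y^-z exp(-ln^2 y / 2 S^2) over y > 0, done in u = ln y."""
    centre = (1.0 - z) * s * s  # where the integrand in u peaks
    f = lambda u: math.exp((1.0 - z) * u - u * u / (2.0 * s * s))
    return 1.0 / simpson(f, centre - 12.0 * s, centre + 12.0 * s)

def prefactor_closed(z, s):
    """Closed form obtained by completing the square in u."""
    return math.exp(-(s * s / 2.0) * (1.0 - z) ** 2) / (s * math.sqrt(2.0 * math.pi))

s = 0.5  # illustrative value of S
for z in (0.0, 1.0, 1.5):
    print(z, prefactor_numeric(z, s), prefactor_closed(z, s))
```

The three rows correspond to the variate referred to the mode, the median and the mean respectively.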

The log normal distribution and the Planck distribution

When looking at the shape of the Planck distribution, the question arises of why it is not a log normal distribution.

The figure of the Planck distribution, as it appears on the mentioned web, has been converted to a relative log normal distribution. They are very similar.

Planck obtained his distribution formula by fitting a curve to the observational data he had on the intensity of black-body radiation. I have read that Planck had problems fitting his curve to the real data.

That web says that log normal distributions appear in the size of silver particles in photographic emulsions. I know that they also appear in the sizes of products from grinding machines, and in other distributions such as the size of companies and so on. See Aitchison and Brown and its Kolmogoroff references.

Then, if radiation produces photons of different sizes, why not accept that the size of photons has a log normal distribution? The size is measured by the wavelength.

$$f(x) = \frac{15}{\pi^4 x^5 \left(e^{1/x} - 1\right)}$$

The maximum is at x max = 0.20145 and for that value Pmax(x) = 3.264.

The relative function is f(x)/Pmax(x) with the variate change

y = x/0.20145 (mode = 1) and dx = 0.20145 dy.
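The position and height of the maximum can be recomputed numerically. The Python sketch below uses a ternary search, which is my own choice of method and assumes that f(x) has a single peak on the bracket [0.05, 1]; the function names are illustrative.

```python
import math

def planck(x):
    """f(x) = 15 / (pi^4 x^5 (e^(1/x) - 1)), as given in the text."""
    return 15.0 / (math.pi ** 4 * x ** 5 * math.expm1(1.0 / x))

def argmax_unimodal(f, lo, hi, iters=200):
    """Ternary search for the maximum of a single-peaked function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)

x_mode = argmax_unimodal(planck, 0.05, 1.0)
f_max = planck(x_mode)

def planck_relative(y):
    """Relative form: variate in units of the mode, ordinate in units of the peak."""
    return planck(y * x_mode) / f_max

print(x_mode, f_max, planck_relative(1.0))
```

The computed mode and peak should be close to the values quoted above, and `planck_relative` can then be plotted against a relative log normal curve for comparison.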

After finishing this report, I saw a figure about the probabilities of the sum of four dice. I have fitted an Elliptic Normal Distribution to it:

$$P(n) = 0.1118745\, e^{-c\left(1 - \sqrt{1 - ((n-14)/10)^2}\right)}$$

with n varying from 4 to 24.
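For comparison, the exact probabilities of the sum of four dice can be enumerated and set against the fitted curve. The value of c is not stated here, so the Python sketch below picks one by a coarse least-squares scan; that scan, its grid and the function names are assumptions made for illustration, not the fitting procedure used for the formula above.

```python
import math
from collections import Counter
from itertools import product

def four_dice_pmf():
    """Exact probabilities of the sum of four fair six-sided dice."""
    counts = Counter(sum(faces) for faces in product(range(1, 7), repeat=4))
    return {n: k / 6 ** 4 for n, k in sorted(counts.items())}

def end_fit(n, c, peak=0.1118745, centre=14, half_width=10):
    """The fitted END of the text, with the unknown c left as a parameter."""
    t = (n - centre) / half_width
    return peak * math.exp(-c * (1.0 - math.sqrt(1.0 - t * t)))

pmf = four_dice_pmf()

def squared_error(c):
    return sum((end_fit(n, c) - p) ** 2 for n, p in pmf.items())

# Coarse scan over c = 0.1, 0.2, ..., 40.0 (hypothetical grid).
best_c = min((k / 10 for k in range(1, 401)), key=squared_error)
print(best_c, pmf[14], end_fit(14, best_c))
```

The exact peak at n = 14 is 146/1296, slightly above the fitted coefficient 0.1118745.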

SEE DETAILED IN SPANISH

SUMMARY

LOG NORMAL DISTRIBUTION

Median M' = 1

$$P(x) = \frac{1}{S x \sqrt{2\pi}}\, e^{-(\ln x)^2 / 2S^2} \qquad P(x_{max}) = \frac{e^{S^2/2}}{S\sqrt{2\pi}}$$

Mode M₀ = 1, $M_0 = M' e^{-S^2}$, $m = M_0\, e^{3S^2/2}$

$$P(x) = \frac{1}{S\sqrt{2\pi}}\, e^{-S^2/2}\, e^{-\ln^2 x / 2S^2} \qquad P(x_{max}) = \frac{e^{-S^2/2}}{S\sqrt{2\pi}}$$

RELATIVE LOG NORMAL DISTRIBUTION

$$P_r(x) = P(x)/P(x_{max}) = e^{-\ln^2 x / 2S^2} \qquad M_0 = 1 \qquad P(x_{max}) = \frac{e^{-S^2/2}}{S\sqrt{2\pi}}$$

RELATIVE NORMAL DISTRIBUTION

$$P_r(x) = P(x)/P(x_{max}) = e^{-x^2 / 2S^2} \qquad M = 0 \qquad P(x_{max}) = \frac{1}{S\sqrt{2\pi}}$$

$$\int_{-b}^{+b} P_r(x)\,dx = A \qquad \int_{-b}^{+b} P(x)\,dx = 1 \qquad P(x_{max}) = 1/A$$

ELLIPTIC AND HYPERBOLIC NORMAL AND LOG NORMAL DISTRIBUTIONS

$\ln P_r(x) = -\frac{1}{2S^2}\, f^2(x)$, with f(x) = x (normal) or f(x) = ln x (log normal)

$$P_r(x) = e^{-f(x)^2 / 2S^2}$$

Elliptic: $\ln P_r(x) = -c\left(1 - \sqrt{1 - (f(x)/b)^2}\right)$. Borders: ±b (normal), $e^{\pm b}$ (log normal)

Hyperbolic: $\ln P_r(x) = c\left(1 - \sqrt{1 + (f(x)/b)^2}\right)$

GENERAL EXPRESSION OF LOG NORMAL DISTRIBUTIONS

$$P(y) = \frac{1}{S\sqrt{2\pi}}\, e^{-(S^2/2)(1-z)^2}\, y^{-z}\, e^{-\ln^2 y / 2S^2}$$

with y measured in units of $M_0\, e^{zS^2}$, M₀ = 1

CENTRIFUGAL DISTRIBUTIONS

$$\ln P_r(x) = -c\sqrt{1 - (x/b)^2} \qquad \ln P_r(x) = -c\sqrt{1 - (\ln x / b)^2} \qquad M_0 = 1$$
