THE LOGNORMAL DISTRIBUTION (LND)
Descriptions of the LND usually give the probability density P(x) as:
$$P(x) = \frac{1}{S x \sqrt{2\pi}}\, e^{-(\ln x - M)^2 / 2S^2}$$
in which S is the standard deviation and M the mean of the normal distribution (ND) of ln x. The median of the LND is $M' = e^{M}$. Note that if M = 0, then M' = 1.
The mode of this LND is $M_0 = e^{M - S^2}$ (Aitchison and Brown: The Lognormal Distribution with special...), which is the same as $M_0 = M' e^{-S^2}$ or $M' = M_0 e^{S^2}$. (It can also be found by differentiation, setting dP/dx = 0.)
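The median and mode formulas above can be checked numerically. The following is a minimal sketch, not part of the original note: the function names and the parameter values M = 0.5, S = 0.6 are illustrative assumptions, and the mode is located by a plain grid search rather than by differentiation.

```python
import math

def lnd_pdf(x, M, S):
    """Log normal density P(x); M and S are the mean and standard deviation of ln x."""
    return math.exp(-(math.log(x) - M) ** 2 / (2 * S * S)) / (S * x * math.sqrt(2 * math.pi))

def lnd_cdf(x, M, S):
    """Cumulative distribution of the LND, written with the error function."""
    return 0.5 * (1.0 + math.erf((math.log(x) - M) / (S * math.sqrt(2))))

def grid_mode(M, S, lo=1e-3, hi=20.0, steps=200_000):
    """Abscissa of the largest density value on a uniform grid (brute force)."""
    h = (hi - lo) / steps
    return max((lo + i * h for i in range(steps + 1)), key=lambda x: lnd_pdf(x, M, S))

M, S = 0.5, 0.6                 # illustrative values only
median = math.exp(M)            # M' = e^M
mode = math.exp(M - S * S)      # M0 = e^(M - S^2)
print(lnd_cdf(median, M, S))    # half of the probability lies below the median
print(mode, grid_mode(M, S))    # closed form and grid search should agree closely
```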
The expression for P(x) can also be written:
$$P(x) = \frac{1}{S x \sqrt{2\pi}}\, e^{-(\ln x - \ln e^{M})^2 / 2S^2}$$
The exponent is then $-\ln^2(x/e^{M}) / 2S^2$ and, with $M' = e^{M}$, it becomes $-\ln^2(x/M') / 2S^2$.
Setting $z = x/M'$, so that $x = zM'$ and $dx = M'\,dz$:
$$P(x)\,dx = \frac{1}{S z M' \sqrt{2\pi}}\, e^{-\ln^2 z / 2S^2}\, M'\,dz = P(z)\,dz$$
where $z = x / (M_0 e^{S^2})$.
Now we can change the variate to $y = x/M_0$:
$$z = y\,e^{-S^2}, \quad y = z\,e^{S^2}, \quad dy = e^{S^2}\,dz, \quad dz = e^{-S^2}\,dy$$
The exponent becomes $-\ln^2(y\,e^{-S^2}) / 2S^2$, and from
$$P(z) = \frac{1}{S z \sqrt{2\pi}}\, e^{-\ln^2 z / 2S^2}$$
we get
$$P(z)\,dz = \frac{e^{S^2}}{S y \sqrt{2\pi}}\, e^{-(\ln y - S^2)^2 / 2S^2}\, e^{-S^2}\,dy = P(y)\,dy$$
We have $(\ln y - S^2)^2 = \ln^2 y - 2S^2 \ln y + S^4$, and therefore
$$P(y) = \frac{1}{S y \sqrt{2\pi}}\, e^{-\ln^2 y / 2S^2}\, e^{\ln y}\, e^{-S^2/2}$$
and, as $e^{\ln y} = y$, we have finally
$$P(y) = \frac{1}{S \sqrt{2\pi}}\, e^{-S^2/2}\, e^{-\ln^2 y / 2S^2} \qquad (1)$$
For y = 1, $x = M_0$ (the mode), and $P(y) = P(y_{max}) = \frac{1}{S\sqrt{2\pi}}\, e^{-S^2/2}$.
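Formula (1) can be compared with the usual form of the LND. Since P(y) dy = P(x) dx with x = M0·y, the two densities should satisfy P(y) = M0·P(x). A small sketch under that reading; the function names and the values M = 0.3, S = 0.7 are assumptions made for illustration.

```python
import math

def lnd_pdf(x, M, S):
    """Usual log normal density P(x) with parameters M and S of ln x."""
    return math.exp(-(math.log(x) - M) ** 2 / (2 * S * S)) / (S * x * math.sqrt(2 * math.pi))

def mode_referred_pdf(y, S):
    """Formula (1): density of y = x / M0, the variate in units of the mode."""
    return (math.exp(-S * S / 2) * math.exp(-math.log(y) ** 2 / (2 * S * S))
            / (S * math.sqrt(2 * math.pi)))

M, S = 0.3, 0.7                  # illustrative values only
M0 = math.exp(M - S * S)         # mode of the usual form
for y in (0.2, 1.0, 3.0):
    # the two printed densities should coincide
    print(y, mode_referred_pdf(y, S), M0 * lnd_pdf(M0 * y, M, S))
```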
We now have the LND referred to the mode and, as we will see, it has some special properties.
First, as $\ln y = -\ln(1/y)$, and so $\ln^2 y = (-\ln(1/y))^2$, the probability densities of reciprocal values referred to the mode are the same. That is, $P(0.5\,M_0)$ is the same as $P(2\,M_0)$. It follows that $M_0$ is the geometric mean of any two abscissae that have the same probability density.
The mean is $m = e^{M + S^2/2}$, so $m = M' e^{S^2/2} = M_0\, e^{S^2} e^{S^2/2} = M_0\, e^{3S^2/2}$, with $M' = M_0\, e^{S^2}$.
The equality of densities at reciprocal values gives a good way to check whether a distribution curve is a LND. First read the abscissa of $M_0$ and call it $X_0$; then take two points of the curve with the same ordinate and read their abscissae, $X_1$ and $X_2$. If $X_1/X_0 = X_0/X_2$, the distribution can be LN.
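The graphical check can be imitated numerically. This is only a sketch: it assumes the curve is available as a function, finds the two abscissae at a chosen fraction of the maximum by bisection, and compares the two ratios. The helper names and the test values are hypothetical.

```python
import math

def lnd_pdf(x, M, S):
    """Log normal density, used here as the curve to be tested."""
    return math.exp(-(math.log(x) - M) ** 2 / (2 * S * S)) / (S * x * math.sqrt(2 * math.pi))

def bisect(g, lo, hi, iters=200):
    """Root of g on [lo, hi]; g(lo) and g(hi) must have opposite signs."""
    g_lo = g(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if (g_mid > 0) == (g_lo > 0):
            lo, g_lo = mid, g_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def equal_height_abscissae(pdf, x0, fraction, x_min=1e-9, x_max=1e3):
    """The two abscissae, left and right of the mode x0, where pdf = fraction * pdf(x0)."""
    target = fraction * pdf(x0)
    g = lambda x: pdf(x) - target
    return bisect(g, x_min, x0), bisect(g, x0, x_max)

M, S = 0.2, 0.5                       # illustrative values only
X0 = math.exp(M - S * S)              # abscissa of the mode
X1, X2 = equal_height_abscissae(lambda x: lnd_pdf(x, M, S), X0, 0.5)
print(X1 / X0, X0 / X2)               # equal ratios: the curve can be LN
```

For a curve that is not log normal the two printed ratios will in general differ.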
The moments referred to $M_0$ are: $M_j = M_0^{\,j}\, e^{jS^2}\, e^{j^2 S^2/2}$. For j = 1 this gives the mean, $M_0\, e^{3S^2/2}$.
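As a check on the moment expression, the j-th moment can be integrated numerically and compared with the closed form $M_0^{\,j} e^{jS^2} e^{j^2S^2/2}$, which is the form that reduces to the mean $M_0 e^{3S^2/2}$ for j = 1. The sketch below assumes that form; the function names and parameter values are illustrative.

```python
import math

def raw_moment(j, M, S, n=20_000):
    """E[x^j] for the LND by the trapezoid rule in t = ln x."""
    a, b = M - 15 * S, M + 15 * S
    h = (b - a) / n

    def g(t):
        # x^j = e^(j t), weighted by the normal density of t = ln x
        return math.exp(j * t - (t - M) ** 2 / (2 * S * S)) / (S * math.sqrt(2 * math.pi))

    total = 0.5 * (g(a) + g(b)) + sum(g(a + i * h) for i in range(1, n))
    return total * h

def moment_formula(j, M0, S):
    """Closed form referred to the mode M0 (assumed form, see lead-in)."""
    return M0 ** j * math.exp(j * S * S) * math.exp(j * j * S * S / 2)

M, S = 0.2, 0.5                       # illustrative values only
M0 = math.exp(M - S * S)
for j in (1, 2, 3):
    print(j, raw_moment(j, M, S), moment_formula(j, M0, S))
```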
The relative probability density function (RLND)
Formula (1) can be written $P(y) = P(y_{max})\, e^{-\ln^2 y / 2S^2}$, and then
$$P_r(y) = P(y) / P(y_{max}) = e^{-\ln^2 y / 2S^2}$$
$P_r(y)$ is a new concept: the relative probability density of the LND (RLND), in which the variate y is referred to the mode; that is to say, y is measured in units of $M_0$.
As $\int P(y)\,dy = 1$, it follows that $\int P_r(y)\,dy = 1/P(y_{max})$.
As a result, it is easier to work with the relative LND referred to the mode than with the LND as it is normally presented. You can calculate the cumulative distribution function for $P_r(y)$ and, to find the corresponding value for $P(y)$, one has only to multiply by $P(y_{max})$.
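The relation between the area under the relative density and P(ymax) can be verified with a short numerical integration. This sketch uses a composite Simpson rule in t = ln y; the helper names and the integration window of ten standard deviations are assumptions chosen for illustration.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def rlnd(y, S):
    """Relative log normal density Pr(y), equal to 1 at the mode y = 1."""
    return math.exp(-math.log(y) ** 2 / (2 * S * S))

def rlnd_area(S):
    """Integral of Pr(y) dy over y > 0, computed in t = ln y (dy = e^t dt)."""
    centre = S * S            # the integrand in t is centred at t = S^2
    return simpson(lambda t: rlnd(math.exp(t), S) * math.exp(t),
                   centre - 10 * S, centre + 10 * S)

def p_max(S):
    """P(ymax) from formula (1)."""
    return math.exp(-S * S / 2) / (S * math.sqrt(2 * math.pi))

for S in (0.3, 1.0):
    print(S, rlnd_area(S) * p_max(S))   # should be very close to 1
```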
From the statistical point of view, the mode is sometimes easy to estimate, and if the mean has also been calculated, the parameter S follows immediately from $m = M_0\, e^{3S^2/2}$. As S has no direct meaning in the distribution, I normally use a = 1/S as the parameter (some authors use it too), with the name "concentration". Some graphs of the distributions P(y) and $P_r(y)$ for different values of S are attached.
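Since the mean and the mode are related by $m = M_0 e^{3S^2/2}$, S can be recovered as $S = \sqrt{(2/3)\ln(m/M_0)}$. A minimal sketch of that calculation; the function names are hypothetical and the input values are generated from known parameters only to show the round trip.

```python
import math

def s_from_mode_and_mean(mode, mean):
    """S from mean = mode * exp(3 S^2 / 2); needs mean > mode > 0."""
    if not (mean > mode > 0):
        raise ValueError("a log normal distribution needs mean > mode > 0")
    return math.sqrt(2.0 * math.log(mean / mode) / 3.0)

def concentration(S):
    """The parameter a = 1/S."""
    return 1.0 / S

M, S_true = 0.1, 0.4                          # illustrative values only
mode = math.exp(M - S_true ** 2)
mean = math.exp(M + S_true ** 2 / 2)
S_est = s_from_mode_and_mean(mode, mean)
print(S_est, concentration(S_est))
```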
The relative distribution concept is also valid for the ND; in this case, if M = 0,
$$P_r(x) = e^{-x^2 / 2S^2} \quad \text{(RND)} \qquad (2)$$
It can be applied to any other distribution too.
Tangent from the origin
For a RLND, let us calculate the tangent to the curve from the origin of coordinates.
The slope of the tangent at a point of the curve $P_r(x) = e^{-\ln^2 x / 2S^2}$ is, by differentiation, $P_r'(x) = P_r(x)\,(-\ln x)/(S^2 x)$, and if the tangent has to pass through the origin, it must be $P_r'(x) = P_r(x)/x$. Then $P_r(x)/x = P_r(x)\,(-\ln x)/(S^2 x)$, or $-\ln x / S^2 = 1$, so $\ln x = -S^2$ and $x = e^{-S^2}$. But $e^{-S^2} = M_0/M'$, the inverse of the median measured in units of the mode. This makes it easy to find the median, and from it S and the mean $M_0\, e^{3S^2/2}$, when one has the curve of the probability distribution. It is easy to draw examples.
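The tangency condition can be tested numerically: at $x = e^{-S^2}$ the slope of the relative curve should equal $P_r(x)/x$. In this sketch the slope is a central finite difference; the names and the value S = 0.8 are assumptions. Note that $1/x = e^{S^2}$ is the median in units of the mode, from which the mean in the same units is $e^{3S^2/2}$.

```python
import math

def rlnd(x, S):
    """Relative log normal density referred to the mode (mode at x = 1)."""
    return math.exp(-math.log(x) ** 2 / (2 * S * S))

def slope(x, S, h=1e-6):
    """Central finite-difference estimate of dPr/dx."""
    return (rlnd(x + h, S) - rlnd(x - h, S)) / (2 * h)

def tangent_point(S):
    """Abscissa where the tangent to the curve passes through the origin."""
    return math.exp(-S * S)

S = 0.8                                   # illustrative value only
xt = tangent_point(S)
print(slope(xt, S), rlnd(xt, S) / xt)     # the two slopes should agree
median_in_mode_units = 1.0 / xt           # e^(S^2)
mean_in_mode_units = median_in_mode_units ** 1.5   # e^(3 S^2 / 2)
print(median_in_mode_units, mean_in_mode_units)
```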
Some outcomes from the RND and RLND
Elliptic Normal Distributions (END)
In (2), setting $P_r(x) = y$ and taking logarithms, we have $\ln y = -x^2 / 2S^2$, which is the equation of a parabola with parameter $S^2$: $x^2 = -2S^2 \ln y$.
If the RND can be generated from a parabola, we can assume that there exists a distribution generated from an ellipse, the parabola being the limit of an ellipse when the axes 2c (the major) and 2b (the minor, parallel to the x axis) increase without bound while $b^2/c$ remains constant. The major half-axis is denoted c to avoid confusion with the concentration a.
The equation of the ellipse equivalent to the preceding parabola is:
$$\frac{(\ln y + c)^2}{c^2} + \frac{x^2}{b^2} = 1$$
and from here:
$$\ln y = -c\left(1 - \sqrt{1 - (x/b)^2}\right) \quad \text{and} \quad y = e^{-c\left(1 - \sqrt{1 - (x/b)^2}\right)}$$
Of the two signs of the root, + is the one to take, because for x = 0 it must be $y = P_r(max) = 1$. This limits the ellipse to its upper half.
This new kind of distribution can be named the elliptic normal distribution (END). Distributions of this type are limited to values of x between $\pm b$, and they look more realistic than the ND, because in reality there is no possibility of finding the theoretical values of $\pm\infty$ that the ND allows for the variate. The error in a measurement can't be $\infty$, because measuring devices always have a finite size. This opinion agrees with what is said on a web page about the ND (http://mathworl.wolfram.com/):
Because they occur so frequently, there is an unfortunate tendency to
invoke normal distributions in situations where they may not be applicable. As
Lippmann stated, "Everybody believes in the exponential law of errors: the
experimenters, because they think it can be proved by mathematics; and the
mathematicians, because they believe it has been established by
observation" (Whittaker and Robinson 1967, p. 179).
If you draw a graph with two relative distributions, one END and one ND, with the parameters c = 10, b = 2 for the END and S = 0.5 for the ND, you can see that the shapes of the curves are almost the same. And for higher concentrations, c = 15, both curves coincide.
A computer can do the integration easily with Simpson's method. As all the elements of the distribution must be included between -b and +b, we have $\int_{-b}^{+b} P(x)\,dx = 1$. As $\int_{-b}^{+b} P_r(x)\,dx = \frac{1}{P(x_{max})} \int_{-b}^{+b} P(x)\,dx = A$, then $P(x_{max}) = 1/A$.
Setting $x/b = \sin w$, it is possible to simplify the expression of $y = P_r(x)$. As $dx = b \cos w\,dw$, then
$$P_r(x)\,dx = P_r(w)\,dw = b\, e^{-c(1 - \cos w)} \cos w\,dw$$
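The normalisation of the END can be sketched with a composite Simpson rule, once directly in x and once with the substitution x = b sin w (so that dx = b cos w dw). The helper names are hypothetical, and the values c = 10, b = 2 are taken from the example in the text; the two areas should agree, and P(xmax) is their reciprocal.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def end_rel(x, c, b):
    """Relative END density Pr(x) for |x| <= b."""
    inside = max(0.0, 1.0 - (x / b) ** 2)     # guard against rounding at the borders
    return math.exp(-c * (1.0 - math.sqrt(inside)))

def area_in_x(c, b, n=2000):
    """A as the integral of Pr(x) dx between -b and +b."""
    return simpson(lambda x: end_rel(x, c, b), -b, b, n)

def area_in_w(c, b, n=2000):
    """A after the substitution x = b sin w, integrated between -pi/2 and +pi/2."""
    return simpson(lambda w: b * math.exp(-c * (1.0 - math.cos(w))) * math.cos(w),
                   -math.pi / 2, math.pi / 2, n)

c, b = 10.0, 2.0
A = area_in_w(c, b)
print(A, area_in_x(c, b), 1.0 / A)            # last value is P(xmax) = 1/A
```

The substituted integrand is smooth at the borders, whereas the integrand in x has a square-root singularity in its slope there, so the form in w is the more comfortable one for Simpson's rule.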
Elliptic Log Normal Distributions (ELND)
The same reasoning can be applied to the LND, because in this case $\ln y = -\ln^2 x / 2S^2$. The equivalent ellipse will be:
$$\frac{(\ln y + c)^2}{c^2} + \frac{\ln^2 x}{b^2} = 1$$
and
$$P_r(x) = e^{-c\left(1 - \sqrt{1 - \ln^2 x / b^2}\right)}$$
In the case of the ELND, the limits of the variate x, with the mode = 1, are $e^{-b}$ and $e^{+b}$. The ELND also has the property indicated above that P(1/x) = P(x). The same opinion given for the END holds for the ELND: these are the realistic distributions, because in the real world there are no variates that allow the value $\infty$, as the LND does.
The integration must be made between the limits $e^{\pm b}$ to find $P(x_{max}) = 1/A$.
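A short sketch of the ELND normalisation, assuming the relative density $P_r(x) = e^{-c(1 - \sqrt{1 - \ln^2 x/b^2})}$ and integrating in t = ln x between -b and +b (dx = e^t dt). Function names and the values c = 10, b = 2 are illustrative assumptions; the symmetry P(1/x) = P(x) is printed as well.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def elnd_rel(x, c, b):
    """Relative ELND density Pr(x) for e^-b <= x <= e^+b, mode at x = 1."""
    inside = max(0.0, 1.0 - (math.log(x) / b) ** 2)   # guard against rounding at the limits
    return math.exp(-c * (1.0 - math.sqrt(inside)))

def elnd_area(c, b, n=4000):
    """A = integral of Pr(x) dx between e^-b and e^+b, computed in t = ln x."""
    return simpson(lambda t: elnd_rel(math.exp(t), c, b) * math.exp(t), -b, b, n)

c, b = 10.0, 2.0
A = elnd_area(c, b)
print(A, 1.0 / A)                                     # area and P(xmax) = 1/A
print(elnd_rel(0.5, c, b), elnd_rel(2.0, c, b))       # reciprocal values, same density
```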
With the logarithms of probability as above, it is possible to philosophize about entropy, information and probability.
Hyperbolic distributions
In the same way as for the elliptic distributions, we can find the expression for the hyperbolic distributions:
$$P_r(x) = e^{c\left(1 - \sqrt{1 + (f(x)/b)^2}\right)}$$
with f(x) = x for the normal case and f(x) = ln x for the log normal case. Like the Normal Distribution, the hyperbolic normal distribution extends from $-\infty$ to $+\infty$. The problem is to find $P_{max}(x)$, because the integration of $P_r(x)$ is not immediate. As happens with the elliptic distributions, at high concentration the hyperbolic and normal distributions coincide. In a graph of the relative normal and hyperbolic normal distributions, the shapes are almost the same.
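Although the integral is not immediate, Pmax can be approximated numerically. The sketch below handles the hyperbolic normal case f(x) = x with Simpson's rule, truncating the range where $P_r$ has fallen to about $e^{-40}$; the cut-off, the function names and the values c = 10, b = 2 are all assumptions made for illustration.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def hnd_rel(x, c, b):
    """Relative hyperbolic normal density Pr(x), equal to 1 at x = 0."""
    return math.exp(c * (1.0 - math.sqrt(1.0 + (x / b) ** 2)))

def hnd_area(c, b, tail=40.0, n=4000):
    """Integral of Pr(x) dx, truncated where Pr has dropped to e^-tail."""
    # solve c * (sqrt(1 + (L/b)^2) - 1) = tail for the cut-off L
    L = b * math.sqrt((1.0 + tail / c) ** 2 - 1.0)
    return simpson(lambda x: hnd_rel(x, c, b), -L, L, n)

c, b = 10.0, 2.0
A = hnd_area(c, b)
print(A, 1.0 / A)        # area under Pr(x) and the resulting Pmax = 1/A
```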
Centrifugal and Centripetal Distributions
Looking at the expressions of the elliptic N and LN distributions, it is logical to ask: what about the lower half of the ellipse? Could there be distributions with limited edges +b and -b that have the maximum probability density at the borders rather than in the center?
With the maximum probability density at the borders, the equations of such distributions should be:
$$\ln P_r(x) = -c\sqrt{1 - (x/b)^2} \quad \text{and} \quad \ln P_r(x) = -c\sqrt{1 - (\ln x / b)^2}$$
Do distributions of this kind exist in reality? They could. I am looking for them. In any case they are "centrifugal distributions". It is a subject to investigate. On the other hand, it is logical that distributions having points at $\infty$, such as the N and LN, can't be centrifugal; they must be centripetal.
GENERAL EXPRESSION OF THE LND
Let a LND be given in the relative form $P_r(x) = e^{-\ln^2 x / 2S^2}$, with $P_{max}(x) = e^{-S^2/2} / (S\sqrt{2\pi})$, the mode being x = 1, and $P(x)\,dx = P(x_{max})\,P_r(x)\,dx$.
If we change the variate to $y = x\,e^{-zS^2}$, that is, if we measure x in units of $e^{zS^2}$, z being a number between $-\infty$ and $+\infty$, we have $x = y\,e^{zS^2}$ and $dx = e^{zS^2}\,dy$.
Making all the substitutions and simplifications, we arrive at
$$P(y) = \frac{1}{S\sqrt{2\pi}}\, e^{-(S^2/2)(1 - z)^2}\, y^{-z}\, e^{-\ln^2 y / 2S^2}$$
That gives P(y) referred to the point separated from the mode by the factor $e^{zS^2}$.
For z = 0 we get the expression for the LND referred to the mode = 1.
For z = 1 we get the expression for the LND referred to the median = 1.
For z = 3/2 we get the expression for the LND referred to the mean = 1.
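The three special cases can be checked numerically. The sketch below assumes the general density in the form $P(y) = \frac{1}{S\sqrt{2\pi}} e^{-(S^2/2)(1-z)^2} y^{-z} e^{-\ln^2 y/2S^2}$, obtained with the variate change $y = x e^{-zS^2}$; with that form the density integrates to 1 for every z, z = 1 reproduces the usual LND with median 1, and z = 3/2 gives mean 1. Names and the value S = 0.5 are illustrative.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def general_pdf(y, S, z):
    """P(y) with y measured in units of exp(z S^2) times the mode (assumed form)."""
    return (math.exp(-(S * S / 2) * (1 - z) ** 2) * y ** (-z)
            * math.exp(-math.log(y) ** 2 / (2 * S * S)) / (S * math.sqrt(2 * math.pi)))

def expectation(g, S, z, n=4000):
    """Integral of g(y) P(y) dy over y > 0, computed in t = ln y (dy = e^t dt)."""
    return simpson(lambda t: g(math.exp(t)) * general_pdf(math.exp(t), S, z) * math.exp(t),
                   -8.0, 8.0, n)

S = 0.5                                            # illustrative value only
for z in (0.0, 1.0, 1.5):
    print(z, expectation(lambda y: 1.0, S, z))     # total probability for each z
print(expectation(lambda y: y, S, 1.5))            # mean when z = 3/2
```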
The log normal distribution and the Planck distribution
When looking at the shape of the Planck distribution, the question arises of why it is not a log normal distribution.
The figure of the Planck distribution, as it appears on the mentioned web page, has been converted to a relative log normal distribution. They are very similar.
Planck obtained his distribution formula by fitting a curve to the observational data he had on the intensity of black body radiation. I have read that Planck had problems fitting his curve to the real data.
That web page says that log normal distributions appear in the grain sizes of silver emulsions for photography. I know that they also appear in the sizes of the products of grinding machines, in distributions of the size of companies, and so on. See Aitchison and Brown and their Kolmogoroff references.
Then, if radiation produces photons of different sizes, why not accept that the sizes of photons have a log normal distribution? The size would be measured by the wavelength.
$$f(x) = \frac{15}{\pi^4 x^5 \left(e^{1/x} - 1\right)}$$
The maximum is at $x_{max} = 0.20145$, and for that value $P_{max}(x) = 3.264$.
The relative function is $f(x)/P_{max}(x)$, with the variate change y = x/0.20145 (mode = 1) and dx = 0.20145 dy.
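The position and height of the maximum of f(x) can be located with a simple grid search, which should land close to the values quoted above. This is a sketch only: the search interval, the step and the function names are assumptions, and the relative function is built from the numerically found maximum rather than from the rounded constants.

```python
import math

def planck(x):
    """f(x) = 15 / (pi^4 x^5 (e^(1/x) - 1)), the Planck density in the reduced variate x."""
    return 15.0 / (math.pi ** 4 * x ** 5 * math.expm1(1.0 / x))

def grid_max(f, lo, hi, steps):
    """Largest value of f on a uniform grid, returned as (abscissa, ordinate)."""
    h = (hi - lo) / steps
    x_best = max((lo + i * h for i in range(steps + 1)), key=f)
    return x_best, f(x_best)

x_max, f_max = grid_max(planck, 0.05, 1.0, 95_000)

def planck_rel(y):
    """Relative Planck function with the mode moved to y = 1."""
    return planck(x_max * y) / f_max

print(x_max, f_max)
print(planck_rel(0.5), planck_rel(1.0), planck_rel(2.0))
```

Unlike the RLND, the relative Planck curve does not take equal values at y and 1/y, which is one way to quantify how far the two shapes differ.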
After finishing this report, I saw a figure about probabilities with four dice. I have fitted an Elliptic Normal Distribution to it:
$$P(n) = 0.1118745\, e^{-c\left(1 - \sqrt{1 - ((n-14)/10)^2}\right)}$$
with n varying from 4 to 24.
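For comparison with such a fit, the exact probabilities of the total of four dice can be computed by repeated convolution. The sketch below also evaluates the fitted expression, but the value of c is not stated in the text, so it is left as an argument the reader must supply; all function names are hypothetical.

```python
import math
from fractions import Fraction

def dice_sum_distribution(n_dice, sides=6):
    """Exact probability of each total of n_dice fair dice, by repeated convolution."""
    counts = {0: 1}
    for _ in range(n_dice):
        new_counts = {}
        for total, ways in counts.items():
            for face in range(1, sides + 1):
                new_counts[total + face] = new_counts.get(total + face, 0) + ways
        counts = new_counts
    denom = sides ** n_dice
    return {total: Fraction(ways, denom) for total, ways in sorted(counts.items())}

def end_fit(n, c, peak=0.1118745, centre=14, b=10):
    """The fitted END expression; c must be supplied, it is not given in the text."""
    inside = max(0.0, 1.0 - ((n - centre) / b) ** 2)
    return peak * math.exp(-c * (1.0 - math.sqrt(inside)))

exact = dice_sum_distribution(4)
for n in (4, 9, 14, 19, 24):
    print(n, float(exact[n]))
```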
SUMMARY
LOG NORMAL DISTRIBUTION
Median $M' = 1$: $P(x) = \frac{1}{S x \sqrt{2\pi}}\, e^{-\ln^2 x / 2S^2}$, $P(x_{max}) = \frac{e^{S^2/2}}{S\sqrt{2\pi}}$
Mode $M_0 = 1$ (with $M_0 = M' e^{-S^2}$ and $m = M_0\, e^{3S^2/2}$): $P(x) = \frac{1}{S\sqrt{2\pi}}\, e^{-S^2/2}\, e^{-\ln^2 x / 2S^2}$, $P(x_{max}) = \frac{e^{-S^2/2}}{S\sqrt{2\pi}}$
RELATIVE LOG NORMAL DISTRIBUTION
$P_r(x) = P(x)/P(x_{max}) = e^{-\ln^2 x / 2S^2}$, $M_0 = 1$, $P(x_{max}) = \frac{e^{-S^2/2}}{S\sqrt{2\pi}}$
RELATIVE NORMAL DISTRIBUTION
$P_r(x) = P(x)/P(x_{max}) = e^{-x^2 / 2S^2}$, $M = 0$, $P(x_{max}) = \frac{1}{S\sqrt{2\pi}}$
$\int_{-b}^{+b} P_r(x)\,dx = A$, $\int_{-b}^{+b} P(x)\,dx = 1$, $P(x_{max}) = 1/A$
ELLIPTIC AND HYPERBOLIC NORMAL AND LOG NORMAL DISTRIBUTIONS
$\ln P_r(x) = -f^2(x) / 2S^2$, with $f(x) = x$ (normal) or $f(x) = \ln x$ (log normal); $P_r(x) = e^{-f(x)^2 / 2S^2}$
Elliptic: $\ln P_r(x) = -c\left(1 - \sqrt{1 - (f(x)/b)^2}\right)$. Borders: $\pm b$ (normal), $e^{\pm b}$ (log normal)
Hyperbolic: $\ln P_r(x) = c\left(1 - \sqrt{1 + (f(x)/b)^2}\right)$
GENERAL EXPRESSION OF LOG NORMAL DISTRIBUTIONS
$P(y) = \frac{1}{S\sqrt{2\pi}}\, e^{-(S^2/2)(1 - z)^2}\, y^{-z}\, e^{-\ln^2 y / 2S^2}$, with y measured from the reference point $M_0\, e^{zS^2}$, $M_0 = 1$
CENTRIFUGAL DISTRIBUTIONS
$\ln P_r(x) = -c\sqrt{1 - (x/b)^2}$ (normal); $\ln P_r(x) = -c\sqrt{1 - (\ln x / b)^2}$ (log normal), $M_0 = 1$