The number frequency of occurrence of each of the bases A, C, G, T in successive blocks of 50 bases of the Drosophila DNA base sequence exhibits selfsimilar *fractal* fluctuations generic to dynamical systems in nature. Continuous periodogram power spectral analyses of the frequency distribution of bases A, C, G, T in the Drosophila DNA base sequence show that the power spectra follow the universal inverse power-law form of the statistical normal distribution. The inverse power-law form for power spectra of space-time fluctuations is generic to dynamical systems in nature and is identified as *self-organized criticality*. The author has developed a general systems theory which provides universal quantification for observed *self-organized criticality* in terms of the statistical normal distribution. The long-range correlations intrinsic to observed *self-organized criticality* are a signature of quantumlike chaos in macro-scale dynamical systems. The results of power spectral analyses are in agreement with the following theoretical predictions. (1) The apparently irregular (chaotic) fluctuations self-organize to form an overall logarithmic spiral trajectory with the quasiperiodic *Penrose* tiling pattern for the internal structure. (2) Conventional power spectral analyses resolve such a spiral trajectory as an eddy continuum with embedded dominant wavebands with progressive increase in phase and bandwidth. The dominant peak periodicities are functions of the *golden mean*.

The important result of the present study is that the observed *fractal* frequency distributions of the bases A, C, G, T of the Drosophila DNA base sequence exhibit long-range spatial correlations or *self-organized criticality* generic to dynamical systems in nature. Therefore, artificial modification of the DNA base sequence structure at any location may have a significant, noticeable effect on the function of the DNA molecule as a whole. Further, the non-coding *introns* may not be redundant, but may serve to organize the effective functioning of the coding *exons* in the DNA molecule as a complete unit.

Heredity in living organisms is determined by a long complex chemical molecule called DNA (deoxyribonucleic acid). The units of heredity, the genes, are parts of the DNA molecule situated along the length of the chromosomes inside the nucleus of the cell. A simplified picture of the molecule of DNA may be visualised to consist of two long backbones with projections sticking out from them at right angles, rather like a ladder with its two upright sides and its rungs. The backbones are made up of two simple chemicals arranged alternately - *sugar* - *phosphate* - *sugar* - *phosphate* - all along the way. The projections are the four units or 'letters' of the code; they are four chemical bases called *guanine*, *cytosine*, *adenine* and *thymine* - G, C, A, T. These four bases are arranged in a specific sequence which constitutes the genetic code. The DNA molecule actually consists not of a single thread, but of two helical threads wound around each other - *a double helix*. The two DNA chains run in opposite directions and are coiled around each other with the bases facing one another in pairs. Only specific pairs of bases can be linked together: T always pairs with A, and G with C (Claire, 1964; Bates and Maxwell, 1993). The amount of A is the same as the amount of T, while the amount of G is the same as the amount of C. These are now known as Chargaff ratios (Gribbin, 1985; Alcamo, 2001).

What distinguishes one type of cell from another and one organism from another is the protein which it contains. And it is DNA which dictates to the cell how many and what types of protein it shall make. Twenty different chemicals called *amino acids* in different sets of combinations form the proteins. The sequence of bases along each DNA molecule in the chromosome determines the sequence of *amino acids* along each of the proteins. It takes a sequence of *3* bases, the *codon*, to identify one *amino acid*. The order in which these bases recur within a particular gene in the helix corresponds to the information needed to build that gene's particular protein (Claire, 1964; Leone, 1992; Ball, 2000).

The genes of higher organisms are seldom 'recorded' in the chromosomes intact, but are scattered in fragmentary fashion along a stretch of DNA, broken up by chunks of DNA which seem at first sight to carry no message at all. These intervening sequences, the apparently useless or "junk" DNA, are known as *introns*. The pieces of DNA carrying genetic code are called *exons*. The *codons*, *64* in number, are distributed over the coding parts of the DNA sequences. It is well known that the coding regions are translated into proteins. The non-coding parts are presumed important in regulatory and promotional activities. The biologically meaningful structures in non-coding regions are not known (Gribbin, 1985; Guharay *et* *al*., 2000; Clark, 2001; Som *et* *al*., 2001). Understanding genetic defects will make it easier to treat them (Watson, 1997).

Historically, Watson and Crick (1953) put together all the experimental data concerning DNA and decided that the only structure that fitted all the facts was the double helix and postulated that DNA is composed of two ribbonlike "backbones" composed of alternating deoxyribose and phosphate molecules. They surmised that nucleotides extend out from the backbone chains and that the *0.34*nm distance represents the space between successive nucleotides. The X-ray data showed a distance of *3.4*nm between turns, so they guessed that ten nucleotides exist per turn. One strand of DNA would only encompass *1*nm width, so they postulated that DNA is composed of two strands to conform to the *2*nm diameter observed in the X-ray diffraction photographs. Scientists now agree that DNA is arranged as a double helix of two intertwined chains, with complementary bases (A-T and G-C) opposing each other. Moreover, the strands run opposite to one another, that is, the strands display reverse polarity. They are said to be "antiparallel". Given the base sequence of one chain of DNA, the base sequence of its partner chain is automatically determined by simply noting which bases are complementary (adenine-thymine or cytosine-guanine). Furthermore, the structure provides a mechanism by which one chain can serve as a template (a model or pattern) for the synthesis of the other chain (Sambamurty, 1999; Alcamo, 2001). The genomic DNA in cells must be highly compacted in order to be contained in the required space. Each chromosome appears to contain a single giant molecule of DNA. At least three levels of condensation are required to package the 10^{3} to 10^{5} micrometer of DNA in a eukaryotic (higher organism) chromosome into a metaphase structure a few microns long. The first level of condensation involves packaging DNA as a supercoil into nucleosomes. This produces the *10*nm diameter interphase chromatin fiber. 
The second level of condensation involves an additional folding and/or supercoiling of the *10*nm nucleosome fiber to produce the *30*nm chromatin fiber. The third level of condensation appears to involve the segregation of segments of the giant DNA molecules present in eukaryotic chromosomes into independently supercoiled domains or loops. The mechanism by which this third level of condensation occurs is not known (Sambamurty, 1999).
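The base-pairing rule described above (A with T, G with C, antiparallel strands) can be illustrated with a minimal sketch. The function names `partner_strand` and `duplex_counts` and the example sequence are hypothetical and chosen only for illustration; they are not part of the study.

```python
# Watson-Crick pairing rule: A pairs with T, G pairs with C.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def partner_strand(strand: str) -> str:
    """Return the partner strand, written in its own reading direction.

    The two strands are antiparallel, so the partner is the complement
    of the input read in reverse order.
    """
    return "".join(PAIR[base] for base in reversed(strand.upper()))


def duplex_counts(strand: str) -> dict:
    """Count each base over both strands of the duplex."""
    both = strand.upper() + partner_strand(strand)
    return {base: both.count(base) for base in "ACGT"}


example = "ATGCGTTA"            # hypothetical sequence, illustration only
partner = partner_strand(example)
counts = duplex_counts(example)
print(partner)                  # TAACGCAT
print(counts)
```

Counted over both strands, the number of A equals the number of T and the number of G equals the number of C, which is the Chargaff ratio mentioned earlier.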

DNA topology is of fundamental importance for a wide range of biological processes (Bates and Maxwell, 1993). One big question in DNA research is whether there is some meaning to the order of the base pairs in DNA. Human DNA has become a fascinating topic for physicists to study. One reason for this fascination is the fact that when living cells divide the DNA is replicated exactly. This is interesting because approximately *95%* of human DNA is called "junk" even by biologists who specialise in DNA. One practical task for physicists is simply to identify which sequences within the molecule are the coding sequences. Another scientific interest is to discover why the "junk" DNA is there in the first place. Almost everything in biology has a purpose that, in principle, is discoverable (Stanley, 2000). The study of statistical patterns in DNA sequences is important as it may improve our understanding of the organization and evolution of life on the genomic level. Recent studies indicate that the DNA sequence of letters A, C, G and T does have a 1/f^{α} frequency spectrum. It is possible, therefore, that the sequences have long-range order and underlying grammar rules. The opinion on this issue remains divided (Som *et* *al*., 2001 and all references therein). The findings of long-range correlations in DNA sequences have attracted much attention, and attempts have been made to relate those findings to known biological features such as the presence of triplet periodicities in protein-coding DNA sequences, the evolution of DNA sequences, the length distribution of protein-coding regions, or the expansion of simple sequence repeats (Holste *et al*., 2001).

A summary of recent results relating to long-range correlation (LRC) in DNA sequences is given in the following. Based on spectral analyses, Li *et* *al*. found (Li, 1992; Li and Kaneko, 1992; Li, Marr and Kaneko, 1994) that the frequency spectrum of a DNA sequence containing mostly *introns* shows 1/f^{α} behavior, which evidences the presence of long-range correlations. The correlation properties of coding and noncoding DNA sequences were first studied by Peng *et* *al*. (1992) in their *fractal* landscape or DNA walk model. Peng *et* *al*. (1992) discovered that there exists LRC in noncoding DNA sequences while the coding sequences correspond to a regular random walk. By doing a more detailed analysis of the same data set, Chatzidimitriou-Dreismann and Larhammar (1993) concluded that both coding and noncoding sequences exhibit LRC. A subsequent work by Prabhu and Claverie (1992) also substantially corroborates these results. Buldyrev *et* *al*. (1995) showed the LRC appears mainly in noncoding DNA using all the DNA sequences available. Alternatively, Voss (1992; 1994), based on equal-symbol correlation, showed a power-law behavior for the sequences studied regardless of the percent of *intron* contents. Havlin *et al*. (1995) state that DNA sequence in genes containing noncoding regions is correlated, and that the correlation is remarkably long range--indeed, base pairs thousands of base pairs distant are correlated. Such long-range correlations are not found in the coding regions of the gene. Havlin *et al*. (1995) suggest that noncoding regions in plants and invertebrates may display a smaller entropy and larger redundancy than coding regions, further supporting the possibility that noncoding regions of DNA may carry biological information. Investigations based on different models seem to suggest different results, as they all look into only a certain aspect of the entire DNA sequence. 
It is therefore important to investigate the degree of correlations in a model-independent way. Hence one may ignore the composition of the four kinds of bases in coding and noncoding segments and only consider the rough structure of the complete genome or long DNA sequences. Yu *et al*. (2000) proposed a time series model based on the global structure of the complete genome and considered three kinds of length sequences. The values of the exponents from these three kinds of length sequences of bacteria indicate that long-range correlations exist in most of these sequences (Yu *et* *al*., 2000 and all the references contained therein). Recently, from a systematic analysis of human *exons*, coding sequences (CDS) and *introns*, Audit *et* *al*. (2001) have found that power law correlations (PLC) are present not only in noncoding sequences but also in coding regions, somehow hidden in their inner codon structure. Although it is now well accepted that long-range correlations do exist in genomic sequences, their biological interpretation is still a matter of continuing debate (Audit *et* *al*., 2001 and all references therein).

The long-range correlation does not necessarily imply a deviation from Gaussianity. For example, the fractional Brownian motion which has Gaussian statistics shows an inverse power law spectrum. According to Allegrini *et* *al*. (1996, based on Levy’s statistics), long-range correlations would imply a strong deviation from Gaussian statistics while the investigation of Arneodo *et* *al*. (1995) yields an important conclusion that the DNA statistics are essentially Gaussian (Mohanty and Narayana Rao, 2000).

In visualizing very long DNA sequences, including the complete genomes of several bacteria, yeast and segments of human genes, it is seen that *fractal*-like patterns underlie these biological objects of prominent importance. The method used to visualize genomes of organisms may well be used as a convenient tool to trace, e.g., evolutionary relatedness of species (Hao *et al*., 2000). Stanley, Amaral *et* *al*. (1996) and Stanley, Afanasyev *et al*. (1996) discuss examples of complex systems composed of many interacting subsystems which display nontrivial long-range correlations or long-term "memory". The statistical properties of DNA sequences, heartbeat intervals, brain plaque in Alzheimer brains, and fluctuations in economics have the common feature that the guiding principles of scale invariance and universality appear to be relevant (Stanley, 2000).

Irregular (nonlinear) fluctuations on all scales of space and time are generic to dynamical systems in nature such as fluid flows, atmospheric weather patterns, heart beat patterns, stock market fluctuations, etc. Mandelbrot (1977) coined the name *fractal* for the non-Euclidean geometry of such fluctuations which have fractional dimension; for example, the rise and subsequent fall with time of the Dow Jones Index or rainfall traces a zig-zag line in a two-dimensional plane and therefore has a *fractal* dimension greater than one but less than two. Mathematical models of dynamical systems are nonlinear and finite precision computer realisations exhibit sensitive dependence on initial conditions resulting in chaotic solutions, identified as *deterministic chaos*. *Nonlinear dynamics and chaos* is now (since the 1980s) an area of intensive research in all branches of science (Gleick, 1987). The *fractal* fluctuations exhibit scale invariance or selfsimilarity manifested as the widely documented (Bak, Tang, Wiesenfeld, 1988; Bak and Chen, 1989; 1991; Schroeder, 1991; Stanley, 1995; Buchanan, 1997) inverse power law form for power spectra of space-time fluctuations identified as *self-organized criticality* by Bak *et al*. (1987). The power-law is a distinctive experimental signature seen in a wide variety of complex systems. In economics it goes by the name *fat tails*, in physics it is referred to as *critical fluctuations*, in computer science and biology it is *the edge of chaos*, and in demographics it is called *Zipf's law* (Newman, 2000). Power-law scaling is not new to economics. The power-law distribution of wealth discovered by Vilfredo Pareto (1848-1923) in the 19^{th} century (Eatwell, Milgate and Newman, 1991) predates any power-laws in physics (Farmer, 1999). One of the oldest scaling laws in geophysics is the *Omori law* (Omori, 1895). 
It describes the temporal distribution of the number of aftershocks which occur after a larger earthquake (i.e., mainshock) by a scaling relationship. The other basic empirical seismological law, the *Gutenberg-Richter law* (Gutenberg and Richter, 1944), is also a scaling relationship, and relates intensity to its probability of occurrence (Hooge *et al*., 1994). Time series analyses of the global market economy also exhibit power-law behaviour (Bak *et al*., 1992; Mantegna and Stanley, 1995; Sornette *et al*., 1995; Chen, 1996a,b; Stanley, Amaral, Buldyrev, Havlin *et al*., 1996; Feigenbaum and Freund, 1997a,b; Gopikrishnan *et al*., 1999; Plerou *et al*., 1999; Stanley *et al*., 2000; Feigenbaum, 2001a,b) with possible *multifractal* structure (Farmer, 1999) and have suggested an analogy to fluid turbulence (Ghashghaie *et al*., 1996; Arneodo *et al*., 1998). Sornette *et al*. (1995) conclude that the observed power-law represents structures similar to '*Elliott waves*' of technical analysis first introduced in the 1930s. Elliott wave analysis describes the time series of a stock price as made of different *waves*; these *waves* are related to each other through the *Fibonacci* series. Sornette *et al*. (1995) speculate that '*Elliott waves*' could be a signature of an underlying critical structure of the stock market. Incidentally, the *Fibonacci* series represents a *fractal* tree-like branching network of selfsimilar structures (Stewart, 1992). The commonly found shapes in nature are the helix and the dodecahedron (Muller and Beugholt, 1996), which are signatures of selfsimilarity underlying *Fibonacci* numbers. The general systems theory presented in this paper shows (Section 2) that the *Fibonacci* series underlies *fractal* fluctuations on all space-time scales. Historically, basic similarity in the branching (*fractal*) form underlying the individual leaf and the tree as a whole was identified more than three centuries ago in botany (Arber, 1950). 
The branching (bifurcating) structure of roots, shoots, veins on leaves of plants, etc., has similarity in form to branched lightning strokes, tributaries of rivers, physiological networks of blood vessels, nerves and ducts in lungs, heart, liver, kidney, brain, etc. (Freeman, 1987; 1990; Goldberger *et al*., 1990; Jean, 1994). Such seemingly complex network structure is again associated with *Fibonacci* numbers seen in the exquisitely ordered beautiful patterns in flowers and arrangement of leaves in the plant kingdom (Jean, 1994; Stewart, 1995). The identification of the physical mechanism for the spontaneous generation of mathematically precise, robust spatial pattern formation in plants will have direct applications in all other areas of science (Mary Selvam, 1998). The importance of scaling concepts was recognized nearly a century ago in biology and botany where the dependence of a property *y* on size *x* is usually expressed by the allometric equation *y=ax ^{b}* where

Continuous periodogram power spectral analyses of the frequency distribution of bases A, C, G, T in Drosophila DNA base sequence agree with model prediction, namely, the power spectra follow the universal inverse power law form of the statistical normal distribution. The geometrical distribution of the DNA bases therefore exhibit

As mentioned earlier (Section 1.3) power spectral analyses of *fractal* space-time fluctuations of dynamical systems exhibits inverse power-law form, i.e., a selfsimilar eddy continuum. The *cell dynamical system model* (Mary Selvam, 1990; Selvam and Fadnavis, 1998, and all references contained therein; Selvam, 2001a, b) is a general systems theory (Capra, 1996) applicable to dynamical systems of all size scales. The model shows that such an eddy continuum can be visualised as a hierarchy of successively larger scale eddies enclosing smaller scale eddies. Eddy or wave is characterised by circulation speed and radius. Large eddies of root mean square (r.m.s) circulation speed *W* and radius *R* form as envelopes enclosing small eddies of r.m.s circulation speed *w _{*}* and radius

(1)

Since the large eddy is but the average of the enclosed smaller eddies, the eddy energy spectrum follows the statistical normal distribution according to the *Central Limit Theorem* (Ruhla, 1992). Therefore, the variance represents the probability densities. Such a result that the additive amplitudes of the eddies, when squared, represent the probabilities is an observed feature of the subatomic dynamics of quantum systems such as the electron or photon (Maddox 1988a, 1993; Rae, 1988). The *fractal* space-time fluctuations exhibited by dynamical systems are signatures of quantumlike mechanics. The cell dynamical system model provides a unique quantification for the apparently chaotic or unpredictable nature of such *fractal* fluctuations ( Selvam and Fadnavis, 1998). The model predictions for quantumlike chaos of dynamical systems are as follows.

(a) The observed *fractal* fluctuations of dynamical systems are generated by an overall logarithmic spiral trajectory with the quasiperiodic *Penrose tiling pattern* (Nelson, 1986; Selvam and Fadnavis, 1998) for the internal structure.

(b) Conventional continuous periodogram power spectral analyses of such spiral trajectories will reveal a continuum of periodicities with progressive increase in phase.

(c) The broadband power spectrum will have embedded dominant wave-bands, the bandwidth increasing with period length. The peak periods (or length scales) $E_n$ in the dominant wavebands will be given by the relation

$E_n = T_s (2+\tau)\tau^{n}$

(2)

where $\tau$ is the *golden mean*, equal to $(1+\sqrt{5})/2 \approx 1.618$, and $T_s$ is the primary perturbation length scale. Considering the most representative example of turbulent fluid flows, namely, atmospheric flows, Ghil (1994) reports that the most striking feature in climate variability on all time scales is the presence of sharp peaks superimposed on a continuous background.

The model predicted periodicities (or length scales), in units of the primary perturbation length scale, are *2.2*, *3.6*, *5.8*, *9.5*, *15.3*, *24.8*, *40.1*, *64.9* and *105.0* respectively for values of *n* ranging from *-1 to 7*. Periodicities (or length scales) close to those predicted by the model have been reported in weather and climate variability (Burroughs, 1992; Kane, 1996), prime number distribution (Selvam, 2001a), and the distribution of the non-trivial Riemann zeta zeros (Selvam, 2001b).
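The quoted values can be reproduced numerically. The sketch below assumes Equation 2 has the form $E_n = T_s(2+\tau)\tau^{n}$ with $T_s = 1$, which is consistent with the values listed above; the function name `peak_periodicity` is hypothetical.

```python
import math

# Golden mean, tau = (1 + sqrt(5)) / 2
TAU = (1 + math.sqrt(5)) / 2


def peak_periodicity(n: int, t_s: float = 1.0) -> float:
    """Peak period E_n = T_s * (2 + tau) * tau**n (assumed form of Equation 2)."""
    return t_s * (2 + TAU) * TAU ** n


# Growth steps n = -1 ... 7, in units of the primary perturbation scale T_s.
predicted = [peak_periodicity(n) for n in range(-1, 8)]

for n, e_n in zip(range(-1, 8), predicted):
    print(f"n = {n:2d}   E_n = {e_n:8.3f}")
```

The computed values agree with the list in the text to within the one-decimal precision quoted there.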

Sornette *et al*. (1995) also conclude that the observed power law represents structures similar to '*Elliott waves*' of technical analysis first introduced in the 1930s. It describes the time series of a stock price as made of different *waves*, these *waves* are in relation to each other through the *Fibonacci* series. Sornette *et al*. (1995) speculate that '*Elliott waves*' could be a signature of an underlying critical structure of the stock market.

(d) The length scale ratio *r/R* also represents the increment $d\theta$ in phase angle $\theta$ (Equation 1). Therefore the phase angle $\theta$ represents the variance. Hence, when the logarithmic spiral is resolved as an eddy continuum in conventional spectral analysis, the increment in wavelength is concomitant with increase in phase (Selvam and Fadnavis, 1998). Such a result that increments in wavelength and phase angle are related is observed in quantum systems and has been named *'Berry's phase'* (Berry 1988; Maddox 1988b; Simon *et al*., 1988; Anandan, 1992). The relationship of angular turning of the spiral to intensity of fluctuations is seen in the tight coiling of the hurricane spiral cloud systems.

The overall logarithmic spiral flow structure is given by the relation

(3)

where the constant *k* is the steady state fractional volume dilution of the large eddy by inherent turbulent eddy fluctuations. The constant *k* is equal to $1/\tau^{2}$ ($\approx 0.382$) and is identified as the *universal constant* for deterministic chaos in fluid flows (Selvam and Fadnavis, 1998). The steady state emergence of *fractal* structures is therefore equal to

$1/k \approx 2.62$

(4)
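The numerical values of *k* and *1/k* follow from the defining identity of the golden mean, $\tau^{2} = \tau + 1$. As a short worked check (a sketch using only that identity):

```latex
\begin{align*}
\tau &= \frac{1+\sqrt{5}}{2} \approx 1.618, &
\tau^{2} &= \tau + 1 \approx 2.618, \\
k &= \frac{1}{\tau^{2}} = 2 - \tau \approx 0.382, &
\frac{1}{k} &= \tau^{2} \approx 2.618 .
\end{align*}
```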

The model predicted logarithmic wind profile relationship such as Equation 3 is a long-established (observational) feature of atmospheric flows in the atmospheric boundary layer; the constant *k*, called the *Von Karman* constant, has a value of *0.38* as determined from observations (Wallace and Hobbs, 1977).

In Equation 3, *W* represents the standard deviation of eddy fluctuations, since *W* is computed as the instantaneous r.m.s. (root mean square) eddy perturbation amplitude with reference to the earlier step of eddy growth. For two successive stages of eddy growth starting from primary perturbation *w _{*}* the ratio of the standard deviations

*statistical normalized standard deviation t=0,1,2,3, etc.*

(5)

The conventional power spectrum plotted as the variance versus the frequency in log-log scale will now represent the eddy probability density on logarithmic scale versus the standard deviation of the eddy fluctuations on linear scale since the logarithm of the eddy wavelength represents the standard deviation, i.e., the r.m.s. value of eddy fluctuations (Equation 3). The r.m.s. value of eddy fluctuations can be represented in terms of statistical normal distribution as follows. A normalized standard deviation *t=0* corresponds to cumulative percentage probability density equal to *50* for the mean value of the distribution. Since the logarithm of the wavelength represents the r.m.s. value of eddy fluctuations the normalized standard deviation *t* is defined for the eddy energy as

$t = (\log L / \log T_{50}) - 1$

(6)

where *L* is the wavelength (or period) and *T _{50}* is the wavelength (or period) up to which the cumulative percentage contribution to total variance is equal to

The periodicities (or length scales) *T _{50}* and

The power spectrum, when plotted as normalised standard deviation *t* versus cumulative percentage contribution to total variance represents the statistical normal distribution (Equation 6), i.e., the variance represents the probability density. The normalised standard deviation values *t* corresponding to cumulative percentage probability densities *P* equal to *50* and *95* respectively are equal to *0* and *2* from statistical normal distribution characteristics. Since *t* represents the eddy growth step *n* (Equation 5) the dominant periodicities (or length scales) *T _{50}* and

$T_{50} = (2+\tau)\tau^{0} \approx 3.6$ unit length segments of 50 bases

(7)

$T_{95} = (2+\tau)\tau^{2} \approx 9.5$ unit length segments of 50 bases

(8)

The above model predictions are applicable to all real world and computed model dynamical systems. Continuous periodogram power spectral analyses of number frequency (per 50 bases) of occurrence of bases A, C, G, T in Drosophila DNA base sequence at different locations along its length give results in agreement with the above model predictions.

The Drosophila DNA base sequence was obtained from the *Berkeley Drosophila Genome Project* (BDGP) resources at http://www.fruitfly.org/index.html. The data set used for the study corresponds to the file NA_ARMS~1 with the title: >2L, 28-11-2001.1 (22207800 bases) segment 1 of 1 for arm 2L on Wed Nov 28 00:30:01 PST 2001 (http://www.fruitfly.org/sequence/sequence_db/na_arms.dros. RELEASE 2.9) finished sequence for 2L. The first *225000* bases were used to give *50* data sets each of length *4500* bases. The number of times that each of the bases A, C, G, T occurs in successive blocks of *50* bases was determined for each data set of *4500* bases. Each data set of *4500* bases then gives *4* groups of *90* frequency sequence values corresponding respectively to the four bases A, C, G, T.
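The block-counting step described above can be sketched as follows. This is an illustrative sketch, not the code used in the study: the function names `split_datasets` and `block_counts` are hypothetical, and a random synthetic sequence stands in for the Drosophila data.

```python
import random
from collections import Counter

BLOCK = 50       # bases per block, as in the text
DATASET = 4500   # bases per data set, as in the text


def split_datasets(sequence: str, dataset: int = DATASET) -> list:
    """Cut the sequence into consecutive, non-overlapping data sets."""
    last_start = len(sequence) - dataset
    return [sequence[i:i + dataset] for i in range(0, last_start + 1, dataset)]


def block_counts(data_set: str, block: int = BLOCK) -> dict:
    """Count A, C, G, T in successive non-overlapping blocks of `block` bases."""
    seq = data_set.upper()
    counts = {base: [] for base in "ACGT"}
    for i in range(len(seq) // block):
        tally = Counter(seq[i * block:(i + 1) * block])
        for base in "ACGT":
            counts[base].append(tally[base])
    return counts


# Synthetic stand-in for real sequence data (assumption, illustration only).
rng = random.Random(0)
synthetic = "".join(rng.choice("ACGT") for _ in range(2 * DATASET))

data_sets = split_datasets(synthetic)
counts_first = block_counts(data_sets[0])
print(len(data_sets), len(counts_first["A"]))   # prints: 2 90
```

Each data set of 4500 bases yields four series of 90 values, one per base, and the four counts in any block sum to 50.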

A representative sample for the frequency of occurrence of base A in successive blocks of length *50* bases is plotted in Figure 1 for *10*, *100*, *1000* and *4500* segments for the total sequence consisting of *225000* bases used in the study. The frequency distribution shows irregular or *fractal* fluctuations for all the segment length scales. The irregular fluctuations may be visualised to result from the superimposition of an ensemble of eddies (wavelengths).

Figure 1: Representative example for *fractal* fluctuations exhibited by frequency distribution of base A in 10 to 4500 data sets

The frequency distribution of bases A, C, G, T follow *statistical normal distribution* (Selvam and Suvarna Fadnavis, 2001) as described in the following. Each data set consists of the frequency distribution *X _{j}* where

The cumulative frequency of occurrence *p _{j}* of base (A, C, G or T) for class intervals

The cumulative percentage frequency of occurrence *p _{c}* of base (A, C, G or T) for class intervals

The graph of cumulative percentage frequency of occurrence *p _{c}* versus the corresponding normalised standard deviation

it is seen that the length scale ratio *r/R* (or frequency ratio) represents the variance spectrum (*W ^{2}/w_{*}^{2}*) and therefore the cumulative frequency distribution follows closely the cumulative normal distribution as shown in Figure 2.
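A comparison of this kind, between an observed cumulative percentage frequency and the cumulative normal distribution, can be sketched as below. The exact class-interval scheme of the study is not reproduced here; the sketch assumes the normalised standard deviation is (value - mean) / standard deviation, and the function names and sample values are hypothetical.

```python
import math
from statistics import mean, pstdev


def normal_cdf_percent(t: float) -> float:
    """Cumulative percentage of the standard normal distribution up to t."""
    return 50.0 * (1.0 + math.erf(t / math.sqrt(2.0)))


def cumulative_percent(values: list) -> list:
    """Return (t, cumulative % of values <= x) for each distinct x, ascending.

    t is the normalised deviation (x - mean) / standard deviation
    (an assumed definition, used here only for illustration).
    """
    m, s = mean(values), pstdev(values)
    if s == 0:
        raise ValueError("all values are equal; normalised deviation undefined")
    n = len(values)
    pairs = []
    for x in sorted(set(values)):
        t = (x - m) / s
        percent = 100.0 * sum(v <= x for v in values) / n
        pairs.append((t, percent))
    return pairs


# Hypothetical block counts for one base (illustration only).
sample = [10, 12, 12, 13, 14, 15, 15, 16, 18]
for t, observed in cumulative_percent(sample):
    print(f"t = {t:6.2f}  observed = {observed:6.1f}  normal = {normal_cdf_percent(t):6.1f}")
```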

Figure 2: The cumulative percentage frequency of occurrence of bases A, C, G, T in the Drosophila DNA sequence follows closely the statistical normal distribution

The broadband power spectrum of space-time fluctuations of dynamical systems can be computed accurately by an elementary, but very powerful method of analysis developed by Jenkinson (1977) which provides a quasi-continuous form of the classical periodogram allowing systematic allocation of the total variance and degrees of freedom of the data series to logarithmically spaced elements of the frequency range (*0.5*, *0*). The periodogram is constructed for a fixed set of *10000*(*m*) wavelengths (or periodicities) *L _{m}* which increase geometrically as

*t _{m} = (log L_{m} / log T_{50})-1*

The cumulative percentage contribution to total variance, the cumulative percentage normalized phase (normalized with respect to the total phase rotation) and the corresponding *t _{m}* values were computed. The power spectra were plotted as cumulative percentage contribution to total variance versus the normalized standard deviation
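The computation of the normalized standard deviation from a spectrum can be sketched as follows. This is a sketch under stated assumptions, not the Jenkinson (1977) procedure itself: the cumulative variance is accumulated from the shortest wavelength upward, $T_{50}$ is taken as the first wavelength at which the cumulative contribution reaches 50 percent (no interpolation), and the function names and the flat synthetic spectrum are hypothetical.

```python
import math


def cumulative_percent_variance(variances: list) -> list:
    """Cumulative percentage contribution to total variance."""
    total = sum(variances)
    running, out = 0.0, []
    for v in variances:
        running += v
        out.append(100.0 * running / total)
    return out


def t50_wavelength(wavelengths: list, variances: list) -> float:
    """First wavelength (ascending order) where cumulative variance >= 50 %."""
    for wavelength, cum in zip(wavelengths, cumulative_percent_variance(variances)):
        if cum >= 50.0:
            return wavelength
    raise ValueError("cumulative variance never reaches 50 percent")


def normalized_t(wavelength: float, t50: float) -> float:
    """t = log(L) / log(T50) - 1; requires T50 > 1 so that log(T50) > 0."""
    return math.log(wavelength) / math.log(t50) - 1.0


# Synthetic flat spectrum on geometrically spaced wavelengths (illustration only).
wavelengths = [2.0 * 1.1 ** m for m in range(40)]
variances = [1.0] * 40

t50 = t50_wavelength(wavelengths, variances)
t_values = [normalized_t(w, t50) for w in wavelengths]
print(f"T50 = {t50:.3f}")
```

Because *t* is a ratio of logarithms, the result does not depend on the logarithm base; *t* is zero at $T_{50}$, negative for shorter wavelengths and positive for longer ones.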

The average variance and phase spectra for the *50* data sets used in the study along with the statistical normal distribution are shown in Figure 3 for the four bases A, C, G, T. The 'goodness of fit' (statistical chi-square test) between the variance spectra and the statistical normal distribution is significant at less than or equal to the *5%* level for all the variance spectra. Eddy variance spectra following the statistical normal distribution are a signature of quantumlike chaos (see Section 2) in the frequency distribution sequence of bases A, C, G, T in the Drosophila DNA base sequence arrangement. Phase spectra are close to the statistical normal distribution, with the 'goodness of fit' being statistically significant for *42*, *36*, *48* and *42* percent of data sets respectively for the four bases A, C, G, T. However, in all the cases, the 'goodness of fit' between variance and phase spectra is statistically significant (chi-square test) for individual dominant wavebands, in particular for shorter wavelengths as shown in Figure 6. Eddy variance spectra following phase spectra is identified as *Berry's phase* and is also a signature of quantumlike chaos (see Section 1, Equation 1). The data sets which do not exhibit *Berry's phase* are indicated in Figure 9.

Figure 3: Average variance (continuous line) and phase (dashed line) spectra for the bases A, C, G, T for the *50* data sets used in the study. The statistical normal distribution (open circles) is also shown.

The power spectra exhibit dominant wavebands where the normalised variance is equal to or greater than *1*. The dominant peak wavelengths (periodicities) were grouped into class intervals *2 - 3*, *3 - 4*, *4 - 6*, *6 - 12*, *12 - 20*, *20 - 30*, *30 - 50*, *50 - 80*, *80 - 120*. These class intervals include the model predicted (Equation 2) dominant peak periodicities (or length scales) *2.2*, *3.6*, *5.8*, *9.5*, *15.3*, *24.8*, *40.1*, *64.9*, *105.0* (in block length segment units of *50* bases) for values of *n* ranging from *-1 to 7*. The wavelength class interval-wise percentage frequency of occurrence of dominant periodicities was computed. In each class interval, the number of dominant statistically significant (less than or equal to *5%*) periodicities and also the number of dominant wavebands which exhibit *Berry*'s phase (variance and phase spectra are the same) were computed as percentages of the total number of dominant wavebands in each class interval. The class interval-wise mean and standard deviation of the above computed frequency distribution of dominant periodicities, significant dominant periodicities and dominant periodicities exhibiting *Berry*'s phase (see Section 2) were then computed for the four bases A, C, G, T in the Drosophila DNA sequence. The average class interval-wise distributions of dominant wavelengths (periodicities), significant dominant wavelengths and dominant wavelengths exhibiting *Berry*'s phase are shown in Figures 4, 5 and 6 respectively.
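The grouping of peak wavelengths into class intervals can be sketched as below. The interval edges are those listed in the text; treating each interval as closed on the left and open on the right is an assumption, as are the function names and the sample peak values.

```python
import math
from bisect import bisect_right

# Class-interval edges from the text: 2-3, 3-4, 4-6, ..., 80-120.
EDGES = [2, 3, 4, 6, 12, 20, 30, 50, 80, 120]
TAU = (1 + math.sqrt(5)) / 2


def class_interval(wavelength: float):
    """Index of the interval [EDGES[i], EDGES[i+1]) holding the wavelength.

    Returns None when the wavelength lies outside the range 2 to 120.
    The half-open boundary convention is an assumption.
    """
    if not EDGES[0] <= wavelength < EDGES[-1]:
        return None
    return bisect_right(EDGES, wavelength) - 1


def percent_by_class(peaks: list) -> list:
    """Percentage of in-range peak wavelengths falling in each class interval."""
    counts = [0] * (len(EDGES) - 1)
    for peak in peaks:
        index = class_interval(peak)
        if index is not None:
            counts[index] += 1
    total = sum(counts)
    return [100.0 * c / total if total else 0.0 for c in counts]


# Model-predicted peaks (2 + tau) * tau**n for n = -1 ... 7.
predicted = [(2 + TAU) * TAU ** n for n in range(-1, 8)]
print([class_interval(p) for p in predicted])   # one predicted peak per interval

# Hypothetical observed peaks (illustration only).
print(percent_by_class([2.5, 2.7, 10.0, 200.0]))
```

Each of the nine model-predicted periodicities falls in a different class interval, which is why this particular set of intervals suits the comparison with Equation 2.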

Figure 4: Average wavelength class interval-wise distribution of dominant wavebands for the four bases A, C, G, T in the *50* data sets (a total of *225000* bases) of Drosophila DNA base sequence used for the study

Figure 5: Average wavelength class interval-wise distribution of dominant significant wavebands for the four bases A, C, G, T in the *50* data sets (a total of *225000* bases) of Drosophila DNA base sequence used for the study

Figure 6: Average wavelength class interval-wise distribution of dominant wavebands exhibiting *Berry's phase* for the four bases A, C, G, T in the *50* data sets (a total of *225000* bases) of Drosophila DNA base sequence used for the study

The model predicts that the apparently irregular *fractal* fluctuations contribute to the ordered growth of the quasiperiodic *Penrose* tiling pattern with an overall logarithmic spiral trajectory such that the successive radii lengths follow the *Fibonacci* mathematical series. Conventional power spectral analysis resolves such a spiral trajectory as an eddy continuum with embedded dominant wavebands, the bandwidth increasing with wavelength. The progressive increase in the radius of the spiral trajectory generates the eddy bandwidth proportional to the increment *d*θ in phase angle equal to *r/R*. The relative eddy circulation speed *W/w _{*}* is directly proportional to the relative peak wavelength ratio R

Considering eddy growth with overall logarithmic spiral trajectory

The eddy circulation speed is related to eddy radius as

The relative peak wavelength is given in terms of eddy circulation speed as

From Equation (1) the relationship between eddy bandwidth and peak wavelength is obtained as

(9)

A log-log plot of peak wavelength versus bandwidth will be a straight line with a slope (bandwidth/peak wavelength) equal to *2*. A log-log plot of the average values of bandwidth versus peak wavelength shown in Figure 7 exhibits a constant slope approximately equal to *2* in agreement with the above model prediction.

Figure 7: Log-log plot of average values of bandwidth versus peak wavelength for the four bases A, C, G, T. The slope (bandwidth/peak wavelength) of this graph, also plotted in the figure, shows an approximately constant value of about *2*.

The mean and standard deviation of the frequency distribution for bases A, C, G, T for all the *50* data sets are given in Figure 8 below. Each data set consists of a sequence of *90* frequency values corresponding to *90* successive block lengths of *50* bases of Drosophila DNA base sequence.
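The construction of the data sets described above can be sketched as follows. This is a minimal illustration, not the author's code: the sequence is a synthetic random stand-in (not Drosophila data), the function names are hypothetical, and the use of the population standard deviation is an assumption, since the text does not say which estimator was used.

```python
import random
from statistics import mean, pstdev

BLOCK_LEN = 50        # bases per block, as stated in the text
BLOCKS_PER_SET = 90   # frequency values per data set, as stated in the text

def block_counts(sequence, base, block_len=BLOCK_LEN):
    """Count occurrences of one base in successive non-overlapping blocks."""
    n_blocks = len(sequence) // block_len
    return [sequence[i * block_len:(i + 1) * block_len].count(base)
            for i in range(n_blocks)]

def split_into_sets(counts, blocks_per_set=BLOCKS_PER_SET):
    """Group the block counts into consecutive data sets of fixed length."""
    n_sets = len(counts) // blocks_per_set
    return [counts[i * blocks_per_set:(i + 1) * blocks_per_set]
            for i in range(n_sets)]

# Synthetic stand-in for the 225000-base sequence (NOT real DNA data).
rng = random.Random(0)
sequence = "".join(rng.choice("ACGT") for _ in range(225000))

# Mean and (population) standard deviation per data set and per base.
stats = {}
for base in "ACGT":
    data_sets = split_into_sets(block_counts(sequence, base))
    stats[base] = [(mean(ds), pstdev(ds)) for ds in data_sets]
```

With *225000* bases, blocks of *50* bases and *90* blocks per data set, this yields *4500* block counts per base and *50* data sets, matching the totals quoted in the text.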

Figure 8

The periodicities *T _{50}* up to which the cumulative percentage contribution to total variance is equal to

Figure 9: The periodicities *T _{50}* up to which the cumulative percentage contribution to total variance is equal to

The number frequency of occurrence of each of the bases A, C, G, T in successive block lengths of *50* bases of Drosophila DNA base sequence exhibits selfsimilar *fractal* fluctuations generic to dynamical systems in nature. The apparently irregular (chaotic) *fractal* fluctuations which characterise the fine-scale geometry of spatial structures in nature are now an intensive field of study in the new science of *nonlinear dynamics and chaos*. The *fractal* fluctuations are basically a zig-zag pattern of successive upward and downward swings such as that shown in Figure 1 for the frequency distribution of bases A, C, G, T for all data lengths, i.e., number of blocks ranging from *10* to the maximum *4500*, a total of *225000* bases of the Drosophila DNA base sequence. Such irregular fluctuations may be visualised as resulting from the superimposition of a continuum of eddies. Power spectral analysis is commonly applied to resolve the component wavelengths and their phases, the wavelengths being given in terms of the unit block length of *50* bases used for determining the frequency distribution. The results of continuous periodogram power spectral analyses of the *fractal* fluctuations in the frequency distribution of bases A, C, G, T in Drosophila DNA base sequence closely follow the model predictions given in Section 2.
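As an illustration of how a power spectrum resolves a series of block counts into component wavelengths and phases, the sketch below uses a plain discrete Fourier periodogram. It is a simplified stand-in and not the continuous periodogram technique used in the study. The function name, the test series and the normalisation (each harmonic expressed as a fraction of total variance) are assumptions made for the example and do not reproduce the normalised variance defined in the paper.

```python
import cmath
import math

def periodogram(values):
    """Discrete Fourier periodogram of a series of block counts.

    Returns one (wavelength_in_blocks, variance_fraction, phase_degrees)
    tuple per harmonic k = 1 .. n // 2. The series must not be constant.
    """
    n = len(values)
    avg = sum(values) / n
    dev = [v - avg for v in values]
    total_variance = sum(d * d for d in dev) / n
    rows = []
    for k in range(1, n // 2 + 1):
        coeff = sum(d * cmath.exp(-2j * math.pi * k * t / n)
                    for t, d in enumerate(dev)) / n
        # Harmonics k and n - k carry equal variance for real input;
        # the Nyquist harmonic (even n only) is counted once.
        factor = 1 if (n % 2 == 0 and k == n // 2) else 2
        variance_k = factor * abs(coeff) ** 2
        rows.append((n / k, variance_k / total_variance,
                     math.degrees(cmath.phase(coeff))))
    return rows

# Hypothetical series of 90 values with a single cycle of 10 blocks.
series = [12 + 3 * math.cos(2 * math.pi * t / 10) for t in range(90)]
spectrum = periodogram(series)
peak = max(spectrum, key=lambda row: row[1])
print("peak wavelength (blocks):", peak[0])
```

Wavelengths here are in units of one block, so a peak wavelength of 10 corresponds to 500 bases when the block length is *50* bases.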

Incidentally, physics at the atomic scale is determined by the rules of quantum mechanics, which tell us that particles propagate like waves and so can be described by a quantum mechanical wave function (Rae, 1999). As an immediate consequence, a particle can be in two or more states at the same time, a so-called superposition of states. This curious behaviour has been hugely successful in describing physical systems at the microscopic level. For example, under the rules of quantum mechanics two atoms sharing an electron form a chemical bond, whereas in classical theory the electron remains confined to one atom and the bond cannot form (Blatter, 2000).

Power spectra of frequency distribution of bases A, C, G, T of Drosophila DNA base sequence follow the model predicted universal and unique inverse power law form of the statistical normal distribution.

Inverse power-law form for power spectra generic to *fractal* fluctuations is a signature of *self-organized criticality* in dynamical systems in nature. The author has shown earlier (Selvam and Suvarna Fadnavis, 1998; Selvam 2001a, b) that (a) *self-organized criticality* can be quantified in terms of the universal inverse power-law form of the statistical normal distribution and (b) *self-organized criticality* of selfsimilar *fractal* fluctuations implies long-range space-time correlations and is a signature of quantumlike chaos in macro-scale dynamical systems of all space-time scales.

Inverse power-law form for power spectra of fluctuations in the spatial distribution of bases A, C, G, T implies long-range spatial correlations, or in other words, persistence or long-term (length scale) memory of short-term fluctuations. The fine scale structure of longer length scale fluctuations carries the signature of shorter length scale fluctuations. The cumulative integration of shorter length scale fluctuations generates longer length scale fluctuations (eddy continuum) with two-way ordered energy feedback between the fluctuations of all length scales (Equation 1). The eddy continuum acts as a robust unified whole fuzzy logic network with global response to local perturbations. An increase in random noise or energy input into the short length scale fluctuations intensifies fluctuations of all other length scales in the eddy continuum and may be noticed immediately in shorter length scale fluctuations. Noise is therefore a precursor to signal.

Real world examples of noise enhancing signal have been reported in electronic circuits (Brown, 1996). Man-made, urbanisation related, greenhouse gas induced global warming (enhancement of small-scale fluctuations) is now held responsible for devastating anomalous changes in regional and global weather and climate in recent years (Selvam and Fadnavis, 1998). Noise and fluctuations are at the seat of all physical phenomena. It is well known that, in linear systems, noise plays a destructive role. However, an emerging paradigm for nonlinear systems is that noise can play a constructive role: in some cases information transfer can be optimized at nonzero noise levels. Another use of noise is that its measured characteristics can give useful information about the system itself. Problems associated with fluctuations have been studied since 1826 (Abbott, 2001).

The apparently irregular *fractal* fluctuations of the frequency distribution of bases A, C, G, T in Drosophila DNA base sequence self-organize spontaneously to generate the robust geometry of a logarithmic spiral with the quasiperiodic *Penrose* tiling pattern for the internal structure. Conventional power spectral analysis resolves such a logarithmic spiral geometry as an eddy continuum with embedded dominant wavebands, the peak periodicities being functions of the *golden mean* and of the primary perturbation length scale, which is equal to the block length of *50* bases used in the present study. Power spectral analyses of the frequency distribution of bases A, C, G, T in Drosophila DNA base sequence also exhibit the model predicted dominant wavebands. These dominant periodicities are intrinsic to the selfsimilar *fractal* fluctuations (space-time) of dynamical systems in nature. Quantum systems are also characterised by continuous irregular space-time fluctuations analogous to *fractal* fluctuations of macro-scale dynamical systems (Hey and Walters, 1989).

The quasicrystalline structure of the quasiperiodic *Penrose* tiling pattern underlies the apparently irregular distribution of the bases A, C, G, T in Drosophila DNA base sequence. Historically, Schrodinger (1967) introduced the concept that the most essential part of a living cell, the chromosome fibre, may suitably be called an aperiodic crystal (Gribbin, 1985). A periodic crystal, like one of common salt, can carry only a very limited amount of information. But an aperiodic crystal, in which there is structure obeying certain fundamental laws but no dull repetition, can carry an enormous amount of information (Gribbin, 1985). The space filling geometric figure of the *Penrose* tiling pattern has intrinsic local *five-fold* symmetry (Devlin, 1997) and also ten-fold symmetry. One of the three basic components of DNA, the deoxyribose, is a five-carbon sugar and may represent the local five-fold symmetry of the quasicrystalline structure of the quasiperiodic *Penrose* tiling pattern of the DNA molecule as a whole. The DNA molecule also shows ten-fold symmetry in the arrangement of *10* bases per turn of the double helix (Watson and Crick, 1953). The study of plant phyllotaxis in botany shows that the quasicrystalline structure of the quasiperiodic *Penrose* tiling pattern provides maximum packing efficiency for seeds, florets, leaves, etc. (Jean, 1994; Stewart, 1995; Mary Selvam, 1998). Quasicrystalline structure of the quasiperiodic *Penrose* tiling pattern may be the geometrical structure underlying the packing of *10 ^{3}* to

The important result of the present study is that the observed

The author is grateful to Dr. A. S. R. Murty for his keen interest and encouragement during the course of this study.

*Physical Review Letters* **74(16)**, 3293-3296. http://linkage.rockfeller.edu/wli/dna_corr/arneodo95.pdf